Vol. XXIII APRIL, 1939 


THE JOURNAL OF 
APPLIED PSYCHOLOGY 


Edited by 
James P. Porter, Ohio University 
Athens, Ohio 





TABLE OF CONTENTS 
W. V. BINGHAM. Halo, Invalid and Valid


MARGARET DAVIDSON AND ANDREW W. BROWN. The De-
velopment and Standardization of the I.J.R. Test for
the Visually Handicapped


JOSEPH TIFFIN AND R. J. GREENLY. Employee Selection
Tests for Electrical Fixture Assemblers and Radio
Assemblers


D. B. LUCAS AND M. J. MURPHY. False Identification of
Advertisements in Recognition Tests 


BOYD R. SHEDDAN AND LOUISE R. WITMER. Employment
Tests for Relief Visitors


DOROTHY T. DYER. The Relation between Vocational In-
terests of Men in College and Their Subsequent Occu-
pational Histories for Ten Years


JOSEPH V. HANNA. A Comparison of Cooperative Test
Scores and High School Grades as Measures for Pre-
dicting Achievement in College


ALICE LEAHY SHEA AND HELEN RUTH HERTZ. An Experi-
ment in Measuring the Effect of Group Instruction for
Foster Parents







Notes: 
EDWARD E. CURETON. Note on the Validity of the
American Council on Education Psychological 
Examination 


FLORENCE L. GOODENOUGH. A Comment on Robert L.
Thorndike’s ‘‘Note on ‘I.Q. Changes in Foster 
Home Children’ by Emmett L. Schott’’ 


News and Notes 
Book Reviews 


New Books and Pamphlets 





HALO, INVALID AND VALID 


W. V. BINGHAM 
Stevens Institute of Technology 


Summary. The well-known halo effect is the tendency for trait ratings 
to reflect in part the rater’s general impression of the person he is rating. 
It complicates every attempt to study the nature and relationships of spe-
cific personal traits. It interferes with sharpness of discrimination in
ratings given to candidates for employment and promotion. But not all 
halo is invalid. This conclusion rests on theoretical considerations with 
respect to the perceptual process and the nature of personality, and finds 
support in data from oral-examination ratings of candidates for civil 
service appointment to administrative, supervisory, and social-work posi-
tions in the Pennsylvania Department of Public Assistance.


When a photographer developing a plate finds a halo
blurring the image, he may suspect that the lens was
out of focus; or that it had not been shaded from
glare; or that there was a flaw in the lens, or in the plate. But 
perhaps the camera has recorded an actual halo, as in a photo- 
graph of the solar corona during an eclipse. 


I 


The tendency for specific trait judgments to reflect in part 
the observer’s general impression has commonly been regarded 
only as a troublesome constant error. The phenomenon, de- 
scribed by Wells in 1907, and christened ‘‘the halo effect’’ by 
Thorndike in 1920, has been investigated from many angles. 
Inquiries, admirably summarized by Symonds¹ and more re-
cently by Gordon Allport,² have in part been directed toward
identification of refractive errors in the observer, his unnoticed
affects, his lack of training as an interviewer, and the like. Or
they have concentrated on controllable conditions exterior to
the observer, such as length of acquaintance and adequacy of
opportunities for observation of evidences regarding the traits
in question. Or they have dealt with differences between traits
in susceptibility to halo, emphasizing the necessity for careful
selection of traits to be rated, and for the precise definition and
proper scaling of these traits. But even after observers have
been cautioned against allowing general impression to mas-
querade in the guise of specific trait ratings, after they have
been trained in drawing out and noting behavior indicative of
the traits to be appraised, and after due allowance has been
made for the positive correlations known to exist between favor-
able traits, there remains a correlation between overall evalua-
tion of the person and specific trait ratings—a halo which can-
not and should not be eliminated because it is inherent in the
nature of personality, in the perceptual process, and in the very
act of judgment.

¹ P. M. Symonds, Diagnosing Personality and Conduct, Chapter III.
New York: Appleton, 1931.

² G. W. Allport, Personality, Chapter XVI. New York: Holt, 1937.

Consider what transpires when an interviewer rates a per- 
son’s ‘‘Speaking Voice.’’ His judgment is an estimate passed 
upon a configuration in which the trait is but an aspect of a 
personality pattern. The observer’s perception of the trait 
varies not independently but with the ground of which it is an
aspect. A clear feminine voice may be judged ‘‘excellent’’
when coming from a young woman, while the very same high-
pitched sounds spoken by a husky athletic male are rated
‘‘ludicrous,’’ ‘‘bad,’’ ‘‘requiring drastic re-training.’’ The
trait is, and should be rated as, a characteristic of the person. 

The ground does not change when the observer’s attention 
shifts from trait to trait of the same individual, from his 
‘‘voice’’ to his ‘‘manner,’’ ‘‘appearance,’’ ‘‘freedom from 
bias,’’ or ‘‘emotional stability.’’

In the situation shortly to be described, judgments were made 
under constraints which called for appraisal of specific traits 
not in the abstract but as indications of personal suitability in 
specified situations. The observers therefore were under the 
necessity of comprehending a still wider ground or field which 
included the total personality pictured in relation to the duties
or situations specified. This broad ground remaining constant, 
statistical evidence of halo in ratings made by skilled observers
is to be expected and welcomed. 

Valid halo, characteristic of correct ratings, should be sharply 
distinguished from the unwanted blur which marks a judgment 
as vague and undiscriminating, carelessly recorded when the 
observer’s attention has been focussed, not on the trait in rela- 
tion to its setting, but on the ground alone. 


II 


This topic was strikingly brought to notice during the early 
stages of a vast program of oral examining carried out in Feb- 
ruary and March, 1938, by a Pennsylvania civil service agency 
known as the Employment Board for the Department of Public 
Assistance. 

There were 60,000 applicants for 5,000 positions, some of 
them clerical. The largest number of openings, however, was 
as social investigator or ‘‘Visitor’’ so-called, the duties being to
establish and maintain intimate contact with needy families 
in which are blind persons, aged, dependent mothers, or unem- 
ployed providers, and to dispense public relief funds to these 
families judiciously, in conformity with established policies and 
regulations. In order to carry these responsibilities to the satis- 
faction of clients, taxpayers and public authorities, Visitors 
obviously need certain abilities and personal traits in addition 
to those measured by marks in written tests. Success in super- 
visory and administrative positions likewise is conditioned 
partly by personal factors, some of which it was deemed neces- 
sary to evaluate by means of oral examinations. 

Well-constructed written tests eliminated all but about 11,000 
of the applicants for these administrative, supervisory and 
public-contact positions. Each of these applicants was then 
brought before one of the many boards of oral examiners for 
rating on traits observable during interview and deemed to 
be of value in the position for which he was a candidate. Each 
examiner recorded also an overall rating on the candidate’s 
‘‘personal fitness for the position.’’ These ratings were based,
not upon extended acquaintance nor upon casual observation,
but on such evidences as could be brought to light during an 
informal conference of twenty to forty-five minutes with each 
candidate. 

The 800 examiners had been selected for their knowledge of 
employment practice, or of social work, or of conditions in the 
locality, as well as for general competence and political disinter- 
estedness. The three or five members of each board were in- 
structed in the nature and purposes of the examination and in 
the responsibilities of the position in question. They also re- 
hearsed the procedures to be followed. Practice interviews and 
rating of pseudo-candidates preceded examination of actual 
candidates. 

During this period of training, the nature and dangers of the 
halo effect were emphatically brought to the notice of the 
examiners. They were warned not to follow the easiest way, 
but to draw out during the examination evidences on which to 
base a rating of each separate trait. 

The graphic rating forms used with candidates for major 
administrative positions called for ratings on ten traits: speak- 
ing voice; appearance; command of language; poise, bearing 
and tact; presentation of ideas; freedom from bias; ability to 
plan and organize; ability to direct and supervise; ability to 
interpret the organization to the community; and finally, per-
sonal fitness for the position. The forms used in rating candi-
dates for positions as Visitor listed eight traits: voice; appear-
ance; language; alertness; ability to present ideas; poise, bear-
ing and tact; judgment; and personal fitness. Each trait was
defined as succinctly as possible in a brief paragraph.* 

Some of the rating sheets were hand-scored; 31,000 of them
were fed to an International Test Scoring Machine—the first 
time this equipment had been adapted to the purpose of com-
puting and combining graphic ratings. The machine electri-
cally reads the rating recorded on each scale, measures its
numerical value, multiplies this value by its predetermined
weighting and averages the separate trait ratings all in a
moment, with inaccuracies of less than one per cent.

* W. V. Bingham, ‘‘Oral Examinations in Civil Service Recruitment,
with Special Reference to Experiences in Pennsylvania.’’ Chicago,
Pamphlet No. 13, Civil Service Assembly, January, 1939.
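The arithmetic performed by the scoring machine, as described above, amounts to a weighted average of the trait ratings. A minimal sketch in Python follows. The trait names, ratings, and weights are invented for illustration; the weights actually used in Pennsylvania are not given here.

```python
def weighted_average(ratings, weights):
    """Return the weighted mean of trait ratings recorded on a common scale."""
    if ratings.keys() != weights.keys():
        raise ValueError("ratings and weights must cover the same traits")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Multiply each rating by its predetermined weight, then average.
    return sum(ratings[t] * weights[t] for t in ratings) / total_weight


# Hypothetical ratings on a 10-point scale and hypothetical weights.
ratings = {"voice": 6.0, "appearance": 7.0, "judgment": 8.0, "personal fitness": 9.0}
weights = {"voice": 1, "appearance": 1, "judgment": 2, "personal fitness": 4}
composite = weighted_average(ratings, weights)
```

Dividing by the sum of the weights keeps the composite on the same scale as the separate ratings, whatever weighting is chosen.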

These ratings and certain conclusions toward which their 
analysis points are summarized elsewhere.* Attention is here
drawn to data from two of the examining boards responsible 
for rating 29 candidates who had passed a written examination 
for the two most responsible administrative posts: a $7,500 
position as Executive Director of the Board of Public Assistance 
of Philadelphia County with responsibility for organizing and 
directing a staff of 1,408 employees carrying a case load of 
85,585 families; and the corresponding position in Allegheny 
County where the staff numbers 851, and the case load, 47,968 
families. 

On each of these boards were five examiners, a majority of 
them drawn from other states. The ratings they gave have been 
scrutinized in search of evidences of differences between the 
interviewers in susceptibility to halo, in leniency, in coarseness
or fineness of discrimination, in the scale-ranges used, and in
the reliability of the ratings given as measured by deviations 
from the consensus. Differences were also noted between traits 
as to the distribution and reliability of the ratings on them. 
Samples consisting of several hundred of the rating sheets for 
Visitor were also inspected for indications of halo. 


III 


Thanks to the care exercised while the examiners were being 
introduced to their duties and familiarized with rating tech- 
niques, only rare instances were found of flagrant halo, the 
candidate having been given nearly identical ratings on all 
traits. There were, however, a number of cases in which an 
examiner had rated one candidate slightly lower than the con-
sensus on every trait and had rated some other candidate higher
than the consensus on nearly every trait, indicating halo in
ratings conscientiously recorded.

Correlations between ratings on the specific traits, and be-
tween these and the overall ratings on ‘‘Personal Fitness,’’ are
all positive and appear to be rather high, although the candi- 
dates for a particular position appearing before any one board 
were too few in number to warrant computing Pearson coeffi- 
cients. Several apparently reasonable schemes of weighting 
the various traits were tried. One of these ignored entirely 
the rating on ‘‘Personal Fitness,’’ on the ground that it largely
duplicated the specific trait ratings. Another scheme assigned 
to this rating a weight equal to that of all the specific trait 
ratings combined. The correlation between the ratings on dif-
ferent traits is such that the rank order of those candidates
whom the examiners were willing to endorse remains practically 
the same, no matter which traits are included, or which scheme 
of internal weighting is applied. The only changes of relative 
position are among candidates so nearly tied that the differences 
between them are not significant. 
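The stability of rank order under different weightings can be illustrated with a small sketch in Python. The candidates and ratings below are invented, and the trait list is shortened; only the two weighting schemes (Personal Fitness ignored, and Personal Fitness weighted equally with all specific traits combined) follow the description above.

```python
TRAITS = ["voice", "appearance", "language", "planning"]


def composite(ratings, fitness_weight):
    """Weighted mean: each specific trait has weight 1, Personal Fitness has fitness_weight."""
    trait_sum = sum(ratings[t] for t in TRAITS)
    total = trait_sum + fitness_weight * ratings["fitness"]
    return total / (len(TRAITS) + fitness_weight)


def rank_order(candidates, fitness_weight):
    """Return candidate names sorted from highest to lowest composite."""
    return sorted(
        candidates,
        key=lambda name: composite(candidates[name], fitness_weight),
        reverse=True,
    )


# Hypothetical ratings on a 10-point scale; traits are positively correlated.
candidates = {
    "A": {"voice": 8, "appearance": 8, "language": 9, "planning": 9, "fitness": 9},
    "B": {"voice": 7, "appearance": 6, "language": 7, "planning": 6, "fitness": 7},
    "C": {"voice": 5, "appearance": 6, "language": 5, "planning": 4, "fitness": 4},
}

# Scheme 1: ignore Personal Fitness entirely.
order_without_fitness = rank_order(candidates, fitness_weight=0)
# Scheme 2: Personal Fitness weighs as much as all specific traits combined.
order_fitness_equal_to_rest = rank_order(candidates, fitness_weight=len(TRAITS))
```

With ratings as closely correlated as these, both schemes yield the same order; candidates nearly tied on the composite could, of course, change places.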

Since ratings on specific traits correlate closely with final esti- 
mates of personal fitness, why should interviewers be asked to 
record the trait ratings? The answer is that an overall judg- 
ment is more likely to be correct if made after the rater’s atten- 
tion has been focussed successively on several of the candidate's 
specific traits. 

It is of interest that the examiners tended to concentrate most 
of their ratings within a relatively narrow range of values on 
the scales for ‘‘Voice,’’ ‘‘Appearance,’’ and ‘‘Command of 
Language’’—all of them seemingly easy to judge—while ratings
on more complex and apparently obscure traits such as ‘‘Ability
to Plan and Organize’’ and ‘‘Ability to Interpret the Organiza-
tion to the Community’’ had almost twice as great an inter-
quartile range.

Nevertheless, there was closer agreement among interviewers 
in their ratings on these latter more complicated traits than on 
the more accessible and apparently more objective traits, as is 
illustrated in Table 1. When closeness of agreement among
raters is taken as a measure of relative objectivity, judgments
regarding traits like ‘‘Voice’’ and ‘‘Appearance’’ are much
less objective than judgments as to ‘‘Ability to Plan and Organ-
ize’’ and ‘‘Personal Fitness for the Position.’’ Probably the
wider average deviations from the consensus when ‘‘Voice’’ or
‘‘Appearance’’ are being rated are traceable in part to greater
differences among the interviewers in their tastes and standards
of excellence with respect to these traits, and in part to the fact
that they gave less time and thought to them, deeming them
relatively unimportant, as well as easy to rate. The objectivity
of a trait rating is a function not alone of the accessibility and
objectivity of the observable evidence.


TABLE 1
Relative Reliability of Trait Ratings
As measured by Average Deviations from the Consensus

Data from five examiners rating 13 candidates for the executive director-
ship in one county. Scale length: 10

Voice
Appearance
Command of Language
Poise, Bearing and Tact
Presentation of Ideas
Freedom from Bias
Ability to Plan and Organize
Ability to Direct and Supervise
Ability to Interpret Organization to the Community
Personal Fitness for the Position

[Average-deviation values not legible in source.]
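The measure named in Table 1 can be sketched in Python as follows. The exact computation is not spelled out here, so this sketch assumes that the consensus for a candidate is the mean of the examiners' ratings on the trait, and that the average deviation is the mean absolute departure from that consensus over all examiners and candidates. The ratings are invented.

```python
from statistics import mean


def average_deviation(ratings_by_candidate):
    """Mean absolute deviation of examiners' ratings from each candidate's consensus.

    ratings_by_candidate: one inner list per candidate, holding the ratings
    given by the several examiners on a single trait.
    """
    deviations = []
    for ratings in ratings_by_candidate:
        consensus = mean(ratings)  # assumed definition of the consensus
        deviations.extend(abs(r - consensus) for r in ratings)
    return mean(deviations)


# Hypothetical ratings: three examiners, two candidates, 10-point scale.
voice = [[4, 6, 8], [3, 5, 7]]
planning = [[6, 6, 6], [7, 8, 9]]

voice_deviation = average_deviation(voice)
planning_deviation = average_deviation(planning)
```

A smaller average deviation indicates closer agreement among examiners, which is the sense in which a trait is here called more reliably rated.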


IV 


Returning now, in conclusion, to the halo effect, the common 
proclivities of raters to make hasty judgments and to lean un- 
duly on affect and general impression are deplorable. Such 
tendencies should be identified and minimized by every prac-
ticable means. But there is a halo that need not be looked at
askance. Rather, in situations like the one described, it signifies 
that the rater has not abstracted the trait from its setting 
within the personality pattern; and that the person, moreover, 
has been seen against his proper background, namely, the posi- 
tion to be filled. 

Finally, it is not the rater alone whose reactions to the candi- 
date are in question. He is but typical of others—clients, sub- 
ordinates, fellow-employees—who will react to the subject, not 
as a bundle of isolated traits, but as a person with certain duties. 
The judgments and responses of all these people will uncon- 
sciously and inevitably manifest a halo effect which is, in part 
at least, valid. 





THE DEVELOPMENT AND STANDARDIZATION 
OF THE I.J.R. TEST FOR THE VISUALLY 
HANDICAPPED¹


MARGARET DAVIDSON AND ANDREW W. BROWN


Institute for Juvenile Research, Chicago 


THERE are, for purposes of general clinical use, a large
number of fairly satisfactory tests for the measurement
of so-called ‘‘general intelligence.’’ Most children may
be given any of a large number of individual or group, lan-
guage or non-language tests intended to determine their
standing in this particular respect. Unfortunately, however, 
when the practicing psychologist finds himself confronted with 
a child handicapped with respect to sight or hearing, he is not 
so well supplied with instruments. It is the individual with 
a visual handicap with which we have concerned ourselves in 
this study. 

The Hayes-Binet is the one test that has been recognized 
and rather generally used with subjects with visual handicaps. 
This test, however, is designed for use only with the totally 
blind, which means that it is not suitable for the largest group 
of visually handicapped subjects—those with enough vision 
so that they have not learned Braille, and who do not, in gen-
eral, employ the same techniques for getting along in the
world as do the totally blind. 

The purpose of this study has been to develop a test by 
which the same thing measured by the conventional intelli-
gence test may be measured in those who are visually handi-
capped; not only in those who are blind, but also in those
whose vision is so defective that they might conceivably be at
a disadvantage in performing on any of the usual scales.

¹ The test described herein may, for the present, be secured by writing
to Andrew W. Brown, Institute for Juvenile Research, 907 S. Wolcott
Avenue, Chicago, Illinois.

The writers wish to express their thanks to Dr. M. W. Richardson for
his assistance with some of the technical problems involved in the con-
struction of the test.

We wish also to acknowledge clerical assistance from W.P.A.

Studies from the Institute for Juvenile Research, Chicago, Paul L.
Schroeder, M.D., Director. Series C, No. 187.

An attempt has been made so to select the material and pro- 
cedure that the test can be used with subjects whose vision is 
of any degree whatever. They may be totally blind, or com-
pletely without visual handicap, so far as the suitability of the
test is concerned. The norms tentatively established at this 
time, and those to be adopted later, are, and will be, based on 
the scores of a group that is normal with respect to vision. 

For a number of reasons the norms of the test have been 
based upon groups assumed to be visually normal. First, it 
would be extremely difficult, both in standardizing and in 
using the test, to draw the lines of demarcation between differ-
ent degrees of visual handicap for purposes of the choice of
scale. Second, the sampling problem, difficult enough in deal- 
ing with the general population, becomes much more so in 
dealing with such a sub-population as that composed of the 
visually handicapped. In the first place the difficulty of ob- 
taining a sufficiently large number of subjects for purposes of 
establishing norms without going into institutions for the
visually handicapped, which is obviously undesirable, would 
be considerable. In the second place, it seems reasonable that 
any norms truly representative of a handicapped group would 
necessarily shift very considerably from time to time and place 
to place as a result of variations in treatment and methods of 
prevention of visual disease. Third, it is more useful to be 
able to compare the intelligence of the handicapped subject 
with that of the visually normal subject, as well as with that 
of others who are handicapped, than to be able to compare 
him only with others who are visually handicapped. And 
finally, if we establish a way of equating the two types of in- 
dividuals with respect to general intelligence it will become 
possible to compare and contrast similar groups of them with
respect to other abilities and traits. 

We have chosen, further, to set up the test in the point scale 
form, wishing to avoid the difficulties and ambiguities to which 
the use of the mental-age scale tends to lead. As to the general
method of administration of the test there has been little 
choice, its purpose determining this to a very large extent. It 
is of necessity an individual test. The instructions are given 
orally in accordance with carefully worked out directions, and 
the subject gives his answers orally, the examiner recording 
as much of such answers as is thought necessary. 

The material used in the test is not, for the most part, orig- 
inal. The first selection of tests was made simply by going 
through all available accepted tests of intelligence and select- 
ing from them the type of material which seemed on inspection 
to be of the sort desired; that is, material was selected which 
would probably be equally difficult for children of all degrees 
of vision. This procedure yielded a large quantity of material 
from which to choose. This was then sifted for tests of such a 
type as could be developed into point scales with widely vary- 
ing degrees of difficulty. The tests thus selected are: 


I. Repetition of sentences.
II. Repetition of digits in reversed order.
III. Vocabulary, concrete.
IV. Same-opposites.
V. Absurdities.
VI. Opposites.
VII. Differences.
VIII. Word naming.
IX. Repetition of the thought of a passage.
X. Similarities.
XI. Disarranged sentences.
XII. Arithmetic problems.
XIII. Logical inferences.
XIV. Analogies.
XV. Number series completion.
XVI. Vocabulary, abstract.
XVII. Comprehension of passages read to subject.


These tests are all, except for test XVII, conventional and 
frequently used tests of intelligence. They were used here as 
nearly as possible in their usual forms, although in many cases 
it was necessary to alter procedures and methods of adminis- 
tration in order to adapt them to a completely oral mode of 
administration. Test XVII involves the reading to the sub-
ject by the examiner of three paragraphs, after each of which
the subject is asked to answer five or six questions. 

Specific items of each of the tests were so selected that they 
might be, as nearly as could be determined by inspection, 
equally difficult for the blind and the seeing. Various persons 
experienced in dealing with the blind, and several blind peo- 
ple were consulted during this process, and in the selection of 
material their suggestions were followed wherever possible. 
It was necessary, however, to have some further evidence that 
the material was equally well adapted to the several types of 
subjects. The seventeen tests were therefore given to three 
groups of subjects: 65 partially blind pupils in the sight-sav-
ing classes of the Chicago public schools; 45 pupils in the
classes for the blind in the Chicago public schools; and 21
pupils with normal vision in the Illinois Soldiers’ and Sailors’ 
Children’s School. Other ratings were obtained for all mem- 
bers of the three groups; the partially blind were given the 
Stanford-Binet examination—tests being omitted or alternates 
used at the discretion of the examiners in such a manner as to 
obtain what seemed to be the best possible rating; the blind 
were given both the Stanford and Hayes-Binet tests; and 
Stanford-Binet ratings were available for the seeing subjects 
of the Illinois Soldiers’ and Sailors’ Children’s School. The 
blind and seeing groups were matched, the average I.Q. and
spread of I.Q.’s being as nearly as possible equated within
each age group. 

Comparison of the scores of the three groups on the tests 
seemed to indicate that we were justified in assuming that 
there were no significant differences in the ability of the blind, 
partially blind, and seeing to perform tasks of the type repre- 
sented by the seventeen tests. This conclusion was based on 
two things: (1) a comparison of the means for the seeing and
blind groups on each test and computation of the actual differ- 
ence between means with the probable error of the difference; 
(2) the observation of the number of age levels at which either 
group showed a higher average score on each test. In general 
there was a slight but insignificant tendency for the blind
group to obtain higher scores than the seeing. 
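The first comparison described above, a difference between means set against the probable error of that difference, may be sketched in Python as follows. The scores are invented. The sketch assumes the usual relation of probable error to standard error (a factor of 0.6745) and uses the sample standard deviation; the authors' exact formula, and any criterion of significance they applied, are not stated here.

```python
from math import sqrt
from statistics import mean, stdev

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard error


def probable_error_of_mean(scores):
    """Probable error of a group mean (sample standard deviation assumed)."""
    return PE_FACTOR * stdev(scores) / sqrt(len(scores))


def difference_in_pe_units(group_a, group_b):
    """Return (difference of means, its probable error, their ratio)."""
    diff = mean(group_a) - mean(group_b)
    pe_diff = sqrt(
        probable_error_of_mean(group_a) ** 2 + probable_error_of_mean(group_b) ** 2
    )
    return diff, pe_diff, diff / pe_diff


# Hypothetical scores on one subtest for two independent groups.
blind = [12, 14, 13, 15, 11]
seeing = [12, 13, 12, 14, 11]
diff, pe_diff, ratio = difference_in_pe_units(blind, seeing)
```

A ratio near 1, as in this invented example, would correspond to the kind of slight but insignificant difference reported in the text.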

On the basis of this analysis the data for all three groups 
were pooled and each of the 17 subtests was studied and com- 
pared with the others in order that a smaller number of tests 
might be selected to make up the final battery. The correla- 
tion of each test with Stanford-Binet mental age was calcu- 
lated, inter-test correlations were obtained, and scoring meth- 
ods were studied. Correlation with Stanford-Binet mental 
age was considered most important in the final selection of the 
tests, but other things taken into consideration were: objec- 
tivity of scoring, length of time required to administer, and 
apparent equality of difficulty for the three groups. On these 
bases ten tests were retained: 

1. Vocabulary (abstract and concrete combined).
2. Comprehension.
3. Arithmetic problems.
4. Repetition of the thought of a passage.
5. Repetition of digits reversed.
6. Opposites.
7. Similarities.
8. Disarranged sentences.
9. Number sequence.
10. Analogies.
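The chief criterion of selection described above, correlation of each subtest with Stanford-Binet mental age, can be illustrated by a short Python sketch. The scores and mental ages are invented and the subtest names merely borrowed from the list of seventeen; the other criteria mentioned (objectivity of scoring, time required, equality of difficulty) are not modelled.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five subjects.
mental_age_months = [96, 108, 120, 132, 144]
subtest_scores = {
    "opposites": [2, 4, 5, 7, 9],
    "word naming": [5, 3, 6, 4, 5],
}

# Order the subtests from highest to lowest correlation with mental age.
ranked = sorted(
    subtest_scores,
    key=lambda name: pearson_r(subtest_scores[name], mental_age_months),
    reverse=True,
)
```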

These subtests were considerably revised, and two forms of 
each were developed. The vocabulary test incorporated into 
the final form is completely different from that in the original 
form. This revision was made quite largely in an effort to use 
words more easy to score. The arithmetic problems test was 
cut into two equivalent halves, as were the opposites, similari- 
ties, and analogies tests. Parallel forms of the comprehension, 
repetition of the thought of a passage, digits, disarranged sen-
tences, and number sequence tests were devised.

Each form of the vocabulary test consists of 20 items rang- 
ing in difficulty from such words as table to such words as 
macrocosm. One point of credit is given for each correct 
response. Differential scoring was tried for a time, but it was 
found to be so unreliable that it was dropped. 


Each form of the comprehension test contains 16 questions
based on three paragraphs of widely varying difficulty. The 
first paragraph consists, in each case, of very simple narrative 
material; the second of expository material; and the third of 
abstract material of considerable difficulty. One point of 
credit is given for each question answered correctly, and in 
several cases partial credit is possible. The scoring of the test 
is fairly easy. 

Each form of test 3 contains ten arithmetic problems vary- 
ing in difficulty from such questions as ‘‘If you have two 
pennies and somebody gives you another, how many pennies 
do you have all together?’’ to ‘‘If a wire twenty inches long 
is to be cut so that one piece is two-thirds as long as the other 
piece, how long must the longer piece be?’’ One point of 
credit is given for each correct response. 

Each form of the repetition of the thought of a passage test
consists of five paragraphs which are read to the subject, and 
after each he is asked to repeat in his own words as much of 
the paragraph as he can. The paragraphs vary from very
simple descriptive material to such as that contained in the 
two paragraphs in the old Stanford-Binet examination. For 
scoring purposes each one of the paragraphs is broken up into 
small divisions, each of which is supposed to contain a single 
idea. The subject is given one point of credit for each of 
these divisions the idea of which has been contained in his 
response. The scoring of this test is difficult as it is quite sub- 
jective in nature. However, detailed instructions for scoring 
have been worked out with examples, and if these are care- 
fully studied and followed the test will be decidedly useful, 
even though not perfectly reliable with respect to scoring. 

The test of repeating digits reversed is the conventional test 
of this type. There are eight items in each form, the shortest 
containing combinations of two digits, the longest, nine. One 
point of credit is given for each item passed.
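The scoring rule for digits reversed is simple enough to state as a short Python sketch. The digit series and responses are invented, and the sketch assumes a single trial per item; the limits of testing used in practice are not modelled.

```python
def score_digits_reversed(items):
    """One point for each item whose response is the presented series in reverse.

    items: list of (presented_digits, response_digits) pairs.
    """
    return sum(
        1
        for presented, response in items
        if list(response) == list(reversed(presented))
    )


# Hypothetical protocol: two items passed, one failed.
protocol = [
    ((3, 7), (7, 3)),
    ((4, 1, 9), (9, 1, 4)),
    ((5, 2, 8, 6), (6, 8, 5, 2)),  # correct would be 6, 8, 2, 5
]
digits_score = score_digits_reversed(protocol)
```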

Each form of the opposites test contains ten items, varying 
in difficulty from up to pride. One point is given for each 
opposite correctly named. 
Each form of the similarities test contains ten items varying
in difficulty from such combinations as apple-peach to such as 
deafness-feeblemindedness-deceitfulness. One point of credit 
is given for each correct response. 

Each form of the disarranged sentences test contains eleven
sentences varying in difficulty from such combinations as ‘‘dog
the barks’’ to ‘‘certain always death of cause kinds sickness.’’
Partial credit is given on this test in accordance with definite 
rules. Scores of 0, 1, and 2 are possible on each sentence. 

Each form of the number sequence test contains ten items 
varying in difficulty from such items as 5, 10, 15, 20, —, — to 
such as 3, 4, 6, 10, —, —. The subject is asked to give the 
next two numbers in the sequence. For each number which is 
correct and in its correct position one point of credit is given. 
Thus the highest possible score on this test is 20. 
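The scoring of the number sequence test, one point for each number correct and in its correct position, may be sketched in Python as follows. The first key follows directly from the sample item 5, 10, 15, 20; the second key (18, 34) is our own reading of the item 3, 4, 6, 10 as a series of doubling differences and is not given in the source. The responses are invented.

```python
def score_sequence_item(key, response):
    """One point for each response number that matches the key in the same position."""
    return sum(1 for expected, given in zip(key, response) if expected == given)


def score_number_sequence(keys, responses):
    """Total score over all items; two points are possible on each item."""
    return sum(score_sequence_item(k, r) for k, r in zip(keys, responses))


keys = [
    (25, 30),  # continues 5, 10, 15, 20
    (18, 34),  # assumed continuation of 3, 4, 6, 10 (differences 1, 2, 4, 8, 16)
]
responses = [
    (25, 30),  # both numbers correct: 2 points
    (18, 30),  # first number correct only: 1 point
]
sequence_score = score_number_sequence(keys, responses)
```

With ten items of two blanks each, the same rule gives the maximum of 20 stated above.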

One form of the analogies test contains 11 items, the other 
12. These items vary in difficulty from such analogies as 
Monday : Tuesday :: Friday : ______ to point : line ::
line : ______. One point of credit is given for each
correct response. 





Detailed instructions for the giving and scoring of the test 
have been worked out. The average time required for giving 
each form is probably about an hour, although, as is true of 
most such tests, the younger the subject, the shorter the time 
required. With less mature subjects it is not necessary to give 
all of each subtest. The limits of testing for each test have 
been worked out on the basis of the data on hand. 

The total score is the weighted sum of the ten subtest scores, 
the following weights being given to each: 





Sub-Test   Wt.     Sub-Test   Wt.

[Weights for sub-tests 1 to 10 not legible in source.]
These weights, and most of the remainder of our present
description of the test, are based on the test scores of 418 sub-
jects with normal vision. These subjects were for the most
part pupils in the Chicago public schools. The age distribu-
tion of the group follows:





Age              Frequency     Age               Frequency

6-0 to 6-5            9        13-6 to 14-5
6-6 to 7-5           22        14-6 to 15-5
7-6 to 8-5           13        15-6 to 16-5
8-6 to 9-5           27        16-6 to 17-5
9-6 to 10-5          37        17-6 to 18-5
10-6 to 11-5         62        18-6 to 19-5
11-6 to 12-5         66        19-6 and above
12-6 to 13-5         55

[Frequencies for ages 13-6 and above not legible in source.]





The grade distribution is as follows: 





Grade   Frequency     Grade   Frequency

[Grade labels are not legible in the source; the frequencies are
given in their original order.]

          21                    52
          23                    56
          19                    49
          25                    12
          63                     9
          69                    10





Of the ten subjects not included in the grade frequency distri- 
bution, six were not at the time in school. Of these, two re- 
ported having completed three years of high school, three 
reported having completed high school, and one had completed 
one year in college. The educational history of the remaining 
four is unknown. 

The following table gives the number of subjects at
each age level who reported the use of a foreign language in the
home, and the number of Negro subjects at each age level.

                    Number reporting use
                    of language other         No. of Negro
Age                 than English              subjects

6-0 to 6-11                 3
7-0 to 7-11                 6
8-0 to 8-11                 4
9-0 to 9-11                 6
10-0 to 10-11
11-0 to 11-11
12-0 to 12-11
13-0 to 13-11
14-0 to 14-11
15-0 to 15-11
16-0 to 16-11
17-0 to 17-11
18-0 to 18-11
19-0 and above

[The remaining entries of this table are not legible in the source.]

The only bases for the selection of these cases were that they
were drawn from schools in neighborhoods in which the population
was of approximately average economic status and that,
in general, children whose grade placements deviated markedly
from that indicated by their chronological ages were
avoided. It is recognized that the group is not ideal, particularly
in that there is such a disproportionately large number of
Negro subjects at some age levels. The norms based on the
group are, of course, only tentative.²

2 For the revision and extension of the norms ten schools have been
selected which as far as could be determined from the census report give
a random sampling of the population. Every seventh child on the teacher’s
register in each grade in each school has been examined. Thirteen
hundred and seven subjects have been given both forms of the test. The
norms will be based on the age distribution of these cases and the 418
cases included in this report.

The weights listed for the ten subtests are based on the intertest
correlations for this group and are roughly proportional
to the quantity

\[
\frac{\sum_{j=1}^{10} r_{ij}}{\sigma_i}
\]

where the numerator represents the sum of the correlations of
test i with all ten tests, its self-correlation being estimated as
one, and the denominator represents the standard deviation of
scores for test i. The use of this formula has the effect of
weighting the subtests in proportion to the extent to which
they correlate with that which is measured by the test as a
whole. This formula, and that for the test reliability, which
follows, were developed by Dr. M. W. Richardson in work as
yet unpublished.
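
The weighting rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `subtest_weights`, the three-subtest correlation matrix, and the standard deviations are hypothetical and are not data from the study; the rule itself (row sum of the correlation matrix, with unit diagonal, divided by the subtest standard deviation) is taken from the description above.

```python
def subtest_weights(corr, sd):
    """Return weights proportional to (sum of correlations with all tests) / SD.

    corr: square list of lists of intercorrelations, diagonal taken as 1.0
    sd:   list of standard deviations, one per subtest
    """
    return [sum(row) / s for row, s in zip(corr, sd)]


# Hypothetical three-subtest example (illustrative values only).
corr_demo = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
sd_demo = [2.0, 4.0, 5.0]
weights_demo = subtest_weights(corr_demo, sd_demo)
```

Because the weights are only "roughly proportional" to this quantity, any convenient rescaling or rounding of the result would serve equally well.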

The reliability of the test was estimated from the calculated 
correlations between forms A and B of each of the ten sub- 
tests according to the formula 


\[
1 - R_{tt} = \frac{\sum_{i=1}^{10} w_i^{2}\,(1 - r_{ii})}
                  {\sum_{i=1}^{10} w_i^{2} + 2 \sum_{i<j} w_i w_j r_{ij}}
\]

a formula for estimating the reliability coefficient of the
weighted sum of the several subtests. In this formula $R_{tt}$ is
the reliability estimate, the $w_i$’s are the products of the
weights for each test i by the standard deviations for the
corresponding test i, the $r_{ii}$’s are the correlations between forms
A and B of each test i, and the $r_{ij}$’s are the correlations of
each test i with every other test j. The reliability coefficient
for the total test as estimated by this formula is .96. The
Pearson product-moment correlation between scores on form
A and form B of the total test is .95. It seems fairly certain,
then, that the reliability of the test is satisfactory.
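
As a rough check on how such an estimate behaves, the reliability of a weighted sum can be computed as below. The sketch assumes the usual form of the formula (error variance of the weighted sum divided by its total variance); the function name and the two-subtest figures are hypothetical and are not the study's data.

```python
def composite_reliability(w, r_forms, corr):
    """Estimate the reliability of a weighted sum of subtests.

    w:       weight times standard deviation for each subtest
    r_forms: correlation between forms A and B for each subtest
    corr:    intercorrelations among subtests (only i < j entries are used)
    """
    k = len(w)
    # Unreliable part of the composite variance.
    error_var = sum(w[i] ** 2 * (1.0 - r_forms[i]) for i in range(k))
    # Total variance of the composite.
    total_var = sum(w[i] ** 2 for i in range(k)) + 2.0 * sum(
        w[i] * w[j] * corr[i][j] for i in range(k) for j in range(i + 1, k)
    )
    return 1.0 - error_var / total_var


# Hypothetical example: two equally weighted subtests, each with
# alternate-form reliability .80, correlating .50 with each other.
r_tt_demo = composite_reliability(
    w=[1.0, 1.0],
    r_forms=[0.8, 0.8],
    corr=[[1.0, 0.5], [0.5, 1.0]],
)
```

In this illustration the composite is more reliable than either subtest alone, which is the general reason a weighted sum of positively correlated subtests can reach a reliability as high as that reported.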

The correlation of scores on form A with chronological age 
is .72; that of form B with chronological age is .70. These 
correlations are based on the 341 subjects ranging in age from 
6-0 through 14-11.

The standard deviation of the test scores for the total group 
of 418 cases is 59 points. The contribution made by each of 
the ten subtests to the total variance of the test is as follows:³

Sub-test    % variance contributed    Sub-test    % variance contributed

[Only four entries (9.4, 9.2, 11.1, and 7.3) are legible in the
source; the sub-tests to which they belong cannot be determined.]





3 These figures are based on the formula

\[
\sigma_t^{2} = w_1 \sum_{i=1}^{10} w_i r_{1i}
             + w_2 \sum_{i=1}^{10} w_i r_{2i}
             + \cdots
             + w_{10} \sum_{i=1}^{10} w_i r_{10,i}
\]

where each term in the right member represents the variance contribution
of the subtest with corresponding subscript. This formula is
taken from unpublished material by Dr. M. W. Richardson and was
secured from him personally.
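
The decomposition in this footnote can be illustrated as follows. The sketch assumes, as in the reliability formula, that each w is the product of a subtest's weight and its standard deviation; the function name and the two-subtest values are hypothetical.

```python
def variance_contributions(w, corr):
    """Split the variance of a weighted sum into one term per subtest.

    Term k is w[k] * sum_i(w[i] * corr[k][i]); the terms add up to the
    total variance of the weighted sum.
    """
    k = len(w)
    terms = [w[a] * sum(w[b] * corr[a][b] for b in range(k)) for a in range(k)]
    total = sum(terms)
    percents = [100.0 * t / total for t in terms]
    return terms, total, percents


# Hypothetical example with two subtests correlating .50.
terms_demo, total_demo, percents_demo = variance_contributions(
    w=[1.0, 2.0],
    corr=[[1.0, 0.5], [0.5, 1.0]],
)
```

The square root of the total is the standard deviation of the weighted score, and the percentages are the kind of figures tabulated for the ten subtests.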





EMPLOYEE SELECTION TESTS FOR ELECTRICAL FIXTURE
ASSEMBLERS AND RADIO ASSEMBLERS*


JOSEPH TIFFIN AND R. J. GREENLY
Purdue University 


In initiating a program of psychological testing in plants
which have not previously used tests of this type in their
employment departments, it has seemed advisable in each
case to ‘‘test the tests’’ on present operators before making a
definite recommendation as to which tests are suitable for
specific jobs. The advantages of this procedure over recommending
tests on the basis of a logical analysis of the job are
several. First, tests which do not function are practically certain
to be eliminated. In the present stage of psychological testing,
a battery of tests selected without such experimentation
is almost certain to contain, at least occasionally, one or more
tests which do not function. One such mistake will harm a
program in the eyes of the management enough to counteract
the favorable impression created by many correct choices.
Secondly, it is very difficult for a psychologist who has not
lived with the job to obtain by observation a knowledge of the
operations intimate enough to enable him to select from a
number of possible tests the three or four which are most
suited to that particular job. Finally, a plant manager or
superintendent usually insists on seeing the proof of any new
idea before he installs it. If he can be shown that a competent
psychologist can come into his plant with a battery of tests and
select those workers who are known from production records
or ratings to be most efficient, he will usually agree that the
tests would be equally effective in picking from a group of
applicants those who would be the most efficient, if given
employment.

* Because of the growing interest of industrial executives and operating
supervisors in psychological methods of personnel selection, this
article is addressed primarily to them and deals at greater length with
certain concepts and statistical procedures than is customary in a
professional journal. Editor.

The procedure, therefore, which underlies the work herein 
described, consists in giving a carefully selected battery of 
tests to present employees and ‘‘testing the tests’’ by deter- 
mining which tests separate the efficient from the inefficient 
operators. 

The present paper describes such experiments conducted on 
three groups of female operators engaged in assembly work. 
The operations of the girls in the three groups may be described 
as follows: 


Group I. Thirty-six operators. Job: Preparing the ends of 
insulated wire for reception of plugs, sockets, or tips. Opera- 
tions of burning, twisting, and soldering, calling primarily for 
hand and arm movements. 

Group II. Thirty-three operators. Job: Placing the plug, 
socket, or tips on the ends of wires, fastening the wire, and 
assembling the plug or socket. 

Group III. Forty-four operators. Job: Assembling radios 
on an assembly line. Placing parts in chassis, connecting and 
soldering wires. 

Groups I and II were made up of operators doing piecework 
on a wage incentive plan of payment. The operators in 
Group III on the assembly line received a flat hourly rate of 
wage payment.

The employees tested were called from their work, several 
at a time, to the office of the personnel manager. The purpose 
of the experiments was explained briefly. In order to make 
as clear as possible the fact that the testing was experimental 
and would have no effect on any present employee’s status 
with the company, a mimeographed sheet containing the following
statement was given to each person tested:


The Personnel Department is cooperating with 
Purdue University in a series of experiments. If 
you are called to help with this work we want you to 
know that this will in no way affect your present 
standing with the Company. 
One result of the experiments may be that they will
uncover abilities in some people that have not been
known. If this should happen every effort will be
made to give you the benefit of these results in your
own work.

PERSONNEL DIRECTOR

The general attitude of the employees toward the work was 
favorable. Quite a few developed a decided interest, particu- 
larly in the vision tests, as the significance of these is more 
apparent than that of some of the other tests given. The fact 
that the vision examination revealed need for glasses in some 
cases, and occasionally indicated that glasses being worn may 
have been incorrectly fitted, also contributed to the general 
interest in the work. 


Four tests were used, namely: O’Connor Finger Dexterity
Test,¹ precision of hand movement test,² Keystone Visual
Safety Tests,³ and the Otis Advanced Intelligence Test.⁴

Let us examine the results on the O’Connor Finger Dexterity
Test by means of a simple graphic analysis. The
standard method of obtaining the score on this test is to
determine the time in seconds required to fill the first half of the
board plus one hundred and ten per cent of the time required
to fill the second half of the board, the total being divided by
two. We found with one hundred operators that the correlation
between scores obtained by this formula and the total
time in minutes required to fill the board was +.989. As this
value is as high as, if not higher than, the reliability of the test,
we discarded the formula and took as the score the total time
in minutes required to fill the board.
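
The two scoring rules compared in this paragraph can be written out as a short sketch. The function names and the sample times are hypothetical; the arithmetic follows the description above (first-half time plus 110 per cent of second-half time, divided by two, against plain total time in minutes).

```python
def standard_dexterity_score(first_half_sec, second_half_sec):
    """Standard score: first-half seconds plus 110% of second-half seconds, halved."""
    return (first_half_sec + 1.10 * second_half_sec) / 2.0


def total_time_minutes(first_half_sec, second_half_sec):
    """Simplified score adopted in the study: total time to fill the board, in minutes."""
    return (first_half_sec + second_half_sec) / 60.0


# Hypothetical operator who takes 240 seconds on each half of the board.
standard_demo = standard_dexterity_score(240.0, 240.0)
total_demo = total_time_minutes(240.0, 240.0)
```

Since the standard score is nearly a fixed multiple of total time whenever the two halves take similar times, a correlation as high as +.989 between the two is not surprising.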

Figure 1 shows how the operators in the three groups compare
in general on this test with a large number of unselected
subjects. Along the baseline of the curve in this figure is
indicated in minutes the time required to fill the board. The
height of the curve at each point indicates the per cent of
persons who were able to fill the board in each of these specified
times. For example, four people out of a hundred, or
four per cent, can fill the board in six minutes; nineteen per
cent require eight minutes and only two per cent require as
long as twelve minutes. The average of the large group of
randomly selected persons is 7.99 minutes.

FIGURE 1. Distributions on finger dexterity test of randomly selected
persons (top) and three groups of female operators (Groups I, II, III).

1 Mildred Hines and Johnson O’Connor, ‘‘A Measure of Finger Dexterity,’’
Personnel Journal, 1926, 4, 379-382.

2 A new test briefly described later in this article. Detailed description
to be published soon.

3 Described in literature of the Keystone View Company, Meadville,
Pennsylvania.

4 World Book Company, Yonkers, New York.

The distribution of subjects in a randomly selected group
is shown at the top of Figure 1. This curve was constructed
from norms published by Bingham.⁵ The three distributions
at the bottom show the three groups of operators. In the
upper distribution the total time ranges from five to fifteen
minutes, the average, as stated, being 7.99 minutes. Among
the operators the range is from five to thirteen minutes. The
averages of the employee groups (7.45, 7.61, and 7.35,
respectively) are slightly lower than the average of 7.99 minutes
for the heterogeneous group, but the differences both in the
averages and the general spread of the distributions are rather
small. We are perhaps safe in concluding that insofar as
this test actually measures finger dexterity, there are in this
plant almost as many of ‘‘the lame, the halt, and the blind’’
as would have been obtained if the operators had been selected
by chance out of the Chicago Telephone Directory! In this
regard the question will be asked: ‘‘What of it? Are the
operators who score high on the dexterity test, that is, those
who require least time to fill the board, more efficient in their
work than those at the opposite extreme?’’

5 Bingham, Walter Van Dyke. Aptitudes and Aptitude Testing, p.
283, Harpers, New York, 1937.

The answer to this question depends on the type of work
being done. In Group I, for example, as shown in Figure 2,
scores on the dexterity test have practically no relation either
to rated efficiency or to actual earnings. In constructing this
figure, the operators in Group I were divided into four
sub-groups or quartiles according to their scores on this test. The
twenty-five per cent scoring lowest (longest time) made up the
first group; the twenty-five per cent scoring next, the second
group; the twenty-five per cent scoring next, the third group;
and the twenty-five per cent scoring highest (shortest time)
made up the fourth group. In the lower left-hand corner of
Figure 2 is graphed the average Standard Hour⁶ production
for the last two months of each of these sub-groups. The
curve is erratic and can be explained only by referring to the
other curves shown in Figure 2. Looking at the curve in the
upper left-hand corner, we note that the average plant experience
of each sub-group tends to agree almost perfectly,
when plotted as a graph, with the average Standard Hour
production record. This indicates that it is experience on the
job, not finger dexterity, that determines the amount of
productivity among employees in the Standard Hour group.
Later in this report we will find further evidence for this
conclusion. The graphs in the right-hand column of Figure 2
show that for the four quartiles, ratings on both quantity and
quality of production follow very closely the amount of experience
and the actual productivity as taken from Standard
Hour production records. There is also considerable agreement
between the average age of a sub-group and its experience
and productivity. Undoubtedly the experience curve is
the key to the situation and we may therefore conclude that it
is experience on the job, not finger dexterity, which determines
efficiency for the type of work done by the operators in
Group I.

FIGURE 2. Finger dexterity of operators in Group I plotted against
plant experience, age, actual production, rated production and rated
quality. An example of a type of operation where the dexterity test fails to
select the better operators.
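
The quartile comparison used for these figures can be sketched as follows. The sketch assumes that the operators are ranked by time on the dexterity test (slowest first), cut into four sub-groups of equal size, and that a criterion such as earnings is averaged within each; the function name and the eight sample operators are hypothetical.

```python
from statistics import mean


def quartile_means(times, criterion):
    """Average a criterion within four sub-groups ordered by test time.

    The first sub-group holds the slowest quarter (lowest dexterity),
    the fourth the fastest quarter (highest dexterity).
    """
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    n = len(order)
    bounds = [round(q * n / 4) for q in range(5)]
    return [
        mean([criterion[i] for i in order[bounds[q]:bounds[q + 1]]])
        for q in range(4)
    ]


# Hypothetical data: minutes to fill the board and a production index.
times_demo = [5, 6, 7, 8, 9, 10, 11, 12]
production_demo = [80, 78, 75, 70, 72, 60, 65, 55]
quartiles_demo = quartile_means(times_demo, production_demo)
```

A curve that rises steadily across the four averages, as in this made-up example, is the pattern one looks for when the test picks the better operators; an erratic curve is the pattern described for Group I.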

The role of finger dexterity is more important in the work
done by the second group of operators. Figure 3 shows the
results for the four quartiles of Group II. The graph in the
upper right-hand corner (supervisor’s ratings on amounts of
production against the time required to finish the dexterity
test) does not rise continuously throughout the entire range,
but operators in the upper three quartiles are rated higher in
production than those in the lowest quartile. Directly below
this is another curve indicating the rated quality of production
for the four quartiles. Here there seems to be little
tendency for the curve either to rise or to fall with varying
amounts of finger dexterity.

FIGURE 3. Finger dexterity of operators in Group II plotted against
plant experience, age, rated production, rated quality and actual
production. An example of a type of operation where the dexterity test picks
the fastest operators.

6 The ‘‘Standard Hour’’ is one of the methods used in industry for
equating jobs which are not identical, so that a wage incentive plan may
be used. The operations are studied by time-motion analysis and equated
so that a certain amount of production in each job is considered the
production of a ‘‘Standard Hour.’’ If an operator produces 50 per cent
more than this amount, she is paid for an hour and a half, rather than
an hour, of work.

In the lower right-hand corner average hourly earnings for 
the last two months are plotted against finger dexterity. This 
is the most objective criterion available. It indicates quite 
definitely that the upper three quartiles in finger dexterity are 
earning more per hour in their present work than those in the 
lowest quartile. This is in agreement with the supervisor’s
ratings on quantity of production and may, perhaps, be taken
as an indication of the validity of the supervisor’s ratings,
inasmuch as the latter had no access to the payroll files or
vouchers. The ratings, therefore, could not have been influenced
by knowledge of the actual earnings.

In the upper left-hand corner is plotted plant experience in
months against time required to finish the dexterity test.
Here we observe that the newer operators do better on the
finger dexterity test than the older and more experienced
operators. This is explained by the fact that a somewhat
similar dexterity test has been in use for several months in
hiring these operators. The newer operators are therefore
more dextrous than the older ones. The girls with greater
finger dexterity have the greatest output, in spite of the fact
that they have had less than the average amount of experience.
In the lower left-hand corner, age is plotted against time
required to finish the dexterity test. This curve indicates that
there is no appreciable relation between age and finger
dexterity, within the age limits found in this group.

In Figure 4, we have a similar graphic presentation of the
results on the dexterity test for Group III. We observe that
when the four quartiles of this group are successively plotted,
here again (looking in the upper left-hand corner) the better
operators score better on this test. Group III contains operators
working on an assembly line. Ratings on the amount of
production, therefore, were not available for the members of
this group, as each worker of the line must do a specified job
and, because of the nature of a line operation, can do no more
nor less than this amount. Neither are there differences in
average hourly wage or Standard Hour production among the
employees in this group because all of the operators are paid
the same hourly rate. We therefore took as the criterion of
efficiency for this group the pooled ratings of four raters: the
department foreman, line foreman, former line foreman, and
personnel manager.

[Panel title: GROUP III, 42 OPERATORS]
FIGURE 4. Finger dexterity of operators in Group III plotted against
pooled ratings of efficiency, plant experience and age. An example of a
type of operation where the finger dexterity test picks the more careful
and efficient operators.

Each of these raters placed the forty-four operators in rank
order of general efficiency, thus giving to each operator a number
varying from 1 to 44. When the four raters had turned
in their rankings, the average of the four rankings for each
operator was computed. In other words, the ratings were
pooled to increase their reliability. From a statistical point
of view this method yielded a satisfactory degree of reliability,
as the average ranking computed from two of the raters
correlated +.77 with the average ranking computed from the
remaining two. The pooled rankings were taken as measures of
actual working efficiency of the operators and were used in
‘‘testing the tests.’’
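
The pooling of rankings and the two-against-two reliability check can be sketched as follows. The function names and the rankings of five operators by four raters are hypothetical; only the procedure (average the ranks, then correlate the average of two raters with the average of the other two) comes from the text.

```python
from statistics import mean


def pooled_ranks(rankings):
    """Average several raters' rankings; rankings[r][i] is rater r's rank of operator i."""
    return [mean(ranks) for ranks in zip(*rankings)]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical rankings of five operators by four raters (1 = best).
rater_1 = [1, 2, 3, 4, 5]
rater_2 = [2, 1, 3, 5, 4]
rater_3 = [1, 3, 2, 4, 5]
rater_4 = [1, 2, 4, 3, 5]

pooled_demo = pooled_ranks([rater_1, rater_2, rater_3, rater_4])
half_a = pooled_ranks([rater_1, rater_2])
half_b = pooled_ranks([rater_3, rater_4])
split_r_demo = pearson(half_a, half_b)
```

Which two raters are placed in each half is an arbitrary choice here; the article does not say how the four raters were paired.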

On the whole, these pooled ratings probably constitute the 
best single criterion we have, even better than average hourly 
earnings when the operators are paid on a wage incentive plan, 
because average hourly earnings sometimes do not take into 
account differences in the quality of work. The efficiency 
ratings for this group of forty-four operators vary from one 
to forty-four, the best operator being #1 and the poorest, 
#44. The smaller the value, therefore, the better the operator. 

Looking again at the upper left-hand curve in Figure 4, we 
find a consistent tendency for these pooled ratings to improve 
with an increase in dexterity, as measured by the dexterity 
test. In the upper right-hand corner is indicated a tendency 
for newer operators in Group III to test better in dexterity 
than older operators. This is explained by the fact that re- 
cent additions to this assembly line were selected from former 
employees laid off during the depression and therefore consti- 
tute a selected sample of efficient workers. 

At the bottom, we observe there is no appreciable relation 
between age and finger dexterity among the operators in this 
group. 

We may pause here to emphasize one of the fundamental 
principles of a testing program. A test that is very helpful 
in selecting employees for one job or type of job may be of no 
value whatever in selecting operators for other kinds of work. 
This is illustrated in the dexterity test results on these three 
groups. The girls in Group I are primarily occupied with 
tasks that require hand and arm movements. Few small
articles that require great finger dexterity are handled.
Hence the finger dexterity test did not pick the better
operators in this group. In Groups II and III, on the other hand,
the girls work primarily with small pieces and utilize many
finger movements. The finger dexterity test accordingly
picked the better operators in these two groups.

Logical analysis of the test and the job will give some idea 
of whether a test is suitable for a given job. But in the last 
analysis, the only way to be sure a test is suitable is to follow 
the procedure we have used here, i.e., ‘‘test the test’’ on present
operators to make sure it actually separates those of high
productivity from those who are less efficient. 

The method of presenting the results exemplified by the 
preceding illustrations is simple and graphic but has certain 
disadvantages. Most psychologists who are interested in this 
type of work would not sanction simple quartile comparisons 
of the type just presented when there are statistical methods 
available which, although somewhat more complicated, are 
more serviceable in making an accurate and scientific evalua- 
tion of the data. 

Of the numerous statistical devices, the technique of corre- 
lation is the most usable in analyzing these results. In evalu- 
ating tests of this type, where many factors in addition to 
those tested are important in determining working efficiency, 
any positive correlation (i.e., one between zero and +1.00) may
be important, even though it is low. High correlations will
seldom be found because the presence of a high correlation 
would indicate that the factor tested is the only thing deter- 
mining efficiency, and this would be entirely unreasonable. 

Correlations between the test results and the several criteria
for Group I are given in Table I. The correlations are indicated
in the columns headed by ‘‘r.’’ Opposite each correlation
is its Probable Error. In one sense, the computation of
Probable Errors is unnecessary in this type of study because
we are mainly interested in describing the relations which
actually exist in the several groups studied, rather than
predicting what would happen if the samples had been larger.
Indeed, the ‘‘samples’’ could not have been larger, since all of
the operators in each group were tested. However, in evaluating
the relative merit of these tests in adding new operators
to the groups, the Probable Errors would be important.

TABLE I

Correlations between Test Results and Different Criteria of Efficiency
for 36 Operators in Group I

Columns, left to right: supervisor’s rating on quality (r, P.E.);
supervisor’s rating on quantity (r, P.E.); actual amount of production
(r, P.E.); amount of plant experience.

[Many entries are garbled in the source. Except in the Hand precision
and Experience rows, where the correlations are confirmed by the text,
values are reproduced as found.]

Dexterity               Ss 02 =-31 1 ar - .09
I.Q. (Otis)             ai Jl =<27 10 .: a .02
Hand precision          .63 .07 .07 .11 -.16 . 10
Visual acuity           a ah 06 .15 04 . 01
Stereopsis              08 115 -.05 .15 -.02 .18 33
Vertical imbalance      (too few failures)
Far point fusion        03 15 -.25 .14 -.07 . 11
Lateral imbal. (far)    ‘ a =-@ 26 <-20 . 39
Lateral imbal. (near)   03 .15 34 13 .03
Near point fusion       34 14 15 14 Re 3 .29
Ametropia (near)        08 .14 pa | 09 .14
Ametropia (far)         18 .15 P 14 .03
Color vision            a 06 .17 , 14 08 .16
Experience              .22 .11 .70 .06 .40 .09
Age                     27 = .10 ‘ 10 an an
Height                  31 .10 ; mS | 02 .11
Weight                  nee ak ak ‘ 10 52 .08
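
The article does not state how the Probable Errors were computed. A minimal sketch, assuming the conventional formula of the period, P.E. = 0.6745 (1 - r²) / √N, is given below; the function name is hypothetical. Under that assumption, with N = 36, the formula reproduces the Probable Errors that are legible in the Experience row of Table I (.11, .06, and .09 for correlations of .22, .70, and .40).

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n cases.

    Assumes the conventional formula 0.6745 * (1 - r**2) / sqrt(n).
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Correlations from the Experience row of Table I, N = 36 operators.
pe_quality = probable_error(0.22, 36)
pe_quantity = probable_error(0.70, 36)
pe_production = probable_error(0.40, 36)
```

Several of the vision-test rows show larger Probable Errors than this formula gives for 36 cases, which would be consistent with fewer operators having scores on those tests; the article does not say so, and this is only an inference.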

Group I included thirty-six operators engaged in piecework 
on a wage incentive plan. In the left-hand column are indi- 
cated the several tests administered: dexterity, intelligence, 
hand precision, and the battery of vision tests. The remaining 
columns present the coefficients of correlation between these 
tests and supervisor’s ratings on quality, supervisor’s ratings 
on quantity, actual amount of production as obtained from 
hourly earnings, and amount of plant experience. 
Looking at the results on the dexterity test, we find no appreciable
correlation with any of the criteria of efficiency. This
corroborates our previous graphic analysis by quartiles of this
group. The finger dexterity test is of little value in selecting
employees for the operations performed by this group of
employees. The importance of experience for the work is also
corroborated in the fourth row from the bottom, where we find
experience correlating +.22 with rated quality, +.70 with rated
quantity, and +.40 with actual production.

In the second row from the top, the correlations with intelligence
are presented. The intelligence test used was the Otis
Self-Administering Test of Mental Ability, Higher Examination,
Form A. Here we find three negative correlations, which
indicate that, in general, operators of low ‘‘mental ability’’ as
measured by this test, are on the average better operators, whatever
the criterion, than those who test high. We cannot avoid
this conclusion by looking at the experience element, because
intelligence correlates only +.02 with experience. This means
that the lower scoring operators have no advantage in terms of
amount of experience on the job. In other words, a standard
group intelligence test of this type, although used to great
advantage in academic work, probably has little or no value in
selecting operators for this particular job. If it has any value
at all, it should be used in selecting those of low intelligence
because the negative correlations mean that there is a tendency
for the operators who, according to this test, are below average
in intelligence to be producing both greater quantity and better
quality work than those who are above average in intelligence.
Illustrating this statement, it is interesting to observe that
several girls of very low intelligence, according to this test, are
actually doing very well on the job. For example, one girl
whose IQ is 69 (which indicates feeble-mindedness according
to Otis’ interpretation of this test score) is making eleven per
cent more than the average wage and is doing work which the
supervisor rates as better than average in quality. This girl
has excellent vision and was high in certain other tests which
are related to efficiency on the job and will be discussed
presently. Another girl, with a tested IQ of 68, also feeble-minded
according to Otis’ norms, was making twenty-two per cent more
than the average wage with her work judged above average in
quality.

In the next row in Table I are the results on the hand precision
test. This test is illustrated in Figure 5. The operator
is asked to punch criss-cross in the board with a stylus as
rapidly as possible for two and a half minutes, without making
‘‘too many’’ errors. She is left to her own judgment as to the
proper ratio between speed and accuracy. Each time a punch
is made, the counter records a score. Each time the side of the
board is touched with the stylus by mistake, a second counter
summates the duration of contact with the side. The test
gives two scores, a production score (the number of punches
in two and a half minutes) and an error score (the total time
the stylus is in contact with the side).

FIGURE 5. Operator taking the Hand Precision Test.

A problem was encountered in combining the production and error
elements in scoring this test. The method tentatively decided upon was
to consider production and errors of equal importance, so that
an operator who is average in both these elements would receive a
score of fifty. If she is ten per cent above average in production
but also makes ten per cent more than the average amount
of errors, her score is still fifty. If she is at average in production
and makes ten per cent fewer errors than average, her
score is fifty-four. In other words, the operator’s score
increases with added production and decreases with added errors.
A table has been made out to score the test in this way.
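
The scoring table itself is not reproduced in the article. Purely as an illustration, the sketch below assumes a linear rule, which is one simple rule that agrees with the three cases stated above: four points for each ten per cent of deviation from the group average, added for production and subtracted for errors. The function name, the parameter `points_per_unit`, and the sample figures are hypothetical and should not be taken as the authors’ actual table.

```python
def hand_precision_score(production, errors, avg_production, avg_errors,
                         points_per_unit=40.0):
    """Combine production and error scores with equal importance (assumed linear rule).

    An operator at the group average on both elements scores 50.
    points_per_unit=40 means a 10 per cent deviation is worth 4 points.
    """
    production_dev = production / avg_production - 1.0
    error_dev = errors / avg_errors - 1.0
    return 50.0 + points_per_unit * (production_dev - error_dev)


# The three cases described in the text, with hypothetical group averages
# of 100 punches and 10 seconds of contact.
score_average = hand_precision_score(100, 10, 100, 10)
score_both_up = hand_precision_score(110, 11, 100, 10)
score_fewer_errors = hand_precision_score(100, 9, 100, 10)
```

Other rules, for instance one based on standard scores rather than percentages, could reproduce the same three cases, so the linear form is an assumption rather than a reconstruction.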

Returning to Table I, we observe that for the operators in 
Group I, the hand precision test correlated +.63 with super- 
visor’s ratings on quality, +.07 with supervisor’s ratings on 
quantity, and -.16 with per cent of average production. When
the test in this form is used with the operators of this group 
and is scored as indicated, therefore, it tends to pick more care- 
ful operators, probably at the expense of picking slightly slower 
workers. 

The battery of vision tests was made up of the Keystone 
Visual Safety Tests.’ Table I shows that in Group I few of 

7 The ten parts of the Keystone Visual Safety Tests are: 

1. Visual Acuity. Measures sensitivity of eyes. 

. Stereopsis. Measures ability of the eyes to see depth, i.e., the 
third dimension. 

Vertical Imbalance. Measures whether the muscles hold the two 
eyes in the same horizontal line of vision. 

. Far Point Fusion. Measures whether the images seen by the 
two eyes fuse properly when looking at distant objects. 
(‘*Distant’’ means twenty feet or more.) 

5. Lateral Imbalance (Near). Measures whether the muscles keep 
the two eyes lined up properly on the same vertical line, in con- 
trast to pulling inward (cross-eyedness) or turning outward 
when looking at near objects. 

. Lateral Imbalance (Far). Measures same as 5, except when 
eyes are looking at distant objects. 

. Near Point Fusion. Measures same as 4, when eyes are looking 
at near objects. 
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these tests correlated very highly with any of the criteria, but 
it is interesting to observe that the correlations between ame- 
tropia (both near and far), as well as color vision, are positive 
with supervisor’s ratings of quality, and negative both with 
ratings of quantity and actual amount of production. By 
comparing the average hourly earnings of operators who 
passed the ametropia tests with those who failed, it was found 
that those in the failing group—.e., those who had a fairly 
great degree of near-sightedness, far-sightedness, or astigma- 
tism—were actually making a five per cent greater average 
hourly wage than those who passed these tests. The operators 
with better vision, however, were rated by their supervisors as 
doing a better quality of work, but not so much of it. The 
obvious explanation of this finding (and, to anticipate, the 
same result was found in Group II) is that the operators with 
defective vision cannot do such precise work as those with nor- 
mal vision and hence are able to turn out more production in 
a given time. The plant in which this work was done is now 
considering a different type of inspection system to eliminate 
this penalty under which the girls with normal vision are work- 
ing. 

Some additional correlations which may prove of interest 
are included in Table I. These are concerned with the relation 
of age, height, and weight to efficiency. The older operators 
have greater productivity than the younger, but their work is 
judged to be poorer in quality. The taller and heavier girls 
are better producers than the shorter and lighter girls, but they 
also sacrifice quality in achieving a larger output. 

Table II gives a similar table of correlations for Group II, 
another group of employees engaged in piecework on a wage- 





8. Ametropia (Near). Measures whether there is any appreciable 
degree of refractive error (astigmatism or inadequate focusing) 
when looking at near objects. 

9. Ametropia (Far). Measures same as 8, when eyes are looking 
at distant objects. 

10. Color Vision. Measures whether any appreciable degree of color 
blindness is present. 

incentive plan. The first four columns of this table are equivalent 
to the four columns in Table I. The fifth column gives 
correlations between the various tests and the supervisor’s rat- 
ings on general efficiency. These ratings presumably con- 
sidered quantity, quality, consistency, and whatever other 
qualities make for effective workmanship, all lumped into a 
single rating. 

Contrary to the correlations obtained in Group I, we find 
here that the dexterity test correlates positively with all four 
criteria, even though it shows a negative correlation with experi- 
ence on the job. This has been pointed out already in connec- 
tion with our graphic analysis of the data of this test. The 
presence of this negative correlation with experience naturally 
raises this issue: The girls testing highest on the dexterity test 
are better operators (indicated by the positive correlations of 
dexterity with the ratings of efficiency and actual production) 
in spite of the fact that they have had less experience on the 
job (indicated by the negative correlation of dexterity and 
experience). What would the relation between dexterity and 
efficiency be if all the operators had had equal experience? 
This is the question that would be asked in the case of appli- 
cants or new employees, for in this case the experience of all 
would be, by definition, the same, namely, zero. By partial 
correlation it is possible to rule out the effect of experience and 
determine quantitatively the relation between dexterity and 
rated quality, rated quantity, and actual amount of production 
when experience is constant. 
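
For reference, the first-order partial correlation that holds a third variable constant is conventionally written as below. The subscripts are our own labels and do not appear in the article: 1 stands for the test score, 2 for the criterion, and 3 for experience.

```latex
r_{12\cdot 3} \;=\; \frac{r_{12} - r_{13}\, r_{23}}
                         {\sqrt{\left(1 - r_{13}^{2}\right)\left(1 - r_{23}^{2}\right)}}
```

When the test correlates negatively with experience (r_{13} < 0) and experience correlates positively with the criterion (r_{23} > 0), the numerator is larger than r_{12}, so the partial correlation tends to exceed the raw one.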

This has been done in the second row of Table II.⁸ We find 
that when experience is ruled out, i.e., equated for all employees, 
the correlations of dexterity with effective workmanship 
are in two of the four cases higher than before, in one the 
same, and in one lower. All are still positive. The evidence 
seems conclusive that the dexterity test definitely tends to pick 


8 This ruling out of experience by partial correlation was not done for 
Group I in Table I because experience did not correlate significantly with 
any of the test scores in this group. 

the better operators, both in quantity and quality of production, 
among the employees in this group. 

The results on the intelligence test for this group are very 
nearly the same as those found for Group I. The correlations 
are -.12 with rated quality, +.22 with rated quantity, -.05 with 
actual amount of production, and +.01 with rated general effi- 
ciency. In other words, the higher scoring operators are pro- 
ducing less work, and work judged to be slightly lower in 
quality, than the lower scoring operators. Ruling out experi- 
ence (see the correlations in the next row) does not change 
this conclusion. 

The precision of hand movement test correlations are given 
in the next row. This test correlates positively with all four 
criteria, indicating that it may well be incorporated in a 
battery of tests for future hiring of operators in this group. 

Inspection of the next ten rows of Table II, which deal with 
the vision tests, substantiates rather clearly for this group the 
conclusions reached with Group I. The operators with nor- 
mal vision are generally rated better in quality of work but 
have a smaller output than the girls with defective eyes. This 
is shown by the fact that the correlations with rated quality 
(in the first column) are nearly all positive, whereas those 
with rated quantity and actual output (second and third col- 
umns) are frequently negative. The presence of numerous 
negative correlations in the column headed ‘‘Rated General 
Efficiency’’ indicates that the supervisors making the ratings 
are influenced more by quantity than by quality in judging 
general efficiency. 

Other correlations included in this table which may be of 
interest, although they do not bear directly upon the test re- 
sults, show the relations existing between age, height, and 
weight and efficiency on the job. The correlations with age 
are low, inconsistent, and probably of no significance. The 
taller and heavier girls, however, definitely have higher pro- 
duction, both rated and actual, and the work of this latter 
group is also judged superior in quality. 

Table III presents the coefficients of correlation between the 
test results and the criteria for the operators in Group III. 


TABLE III 

Correlations between Test Results and Pooled Ratings of Efficiency for 
42 Operators in Group III 

                             Pooled          Amount of       Pooled efficiency 
                             ratings of      plant           ratings (experience 
                             efficiency      experience      constant) 

                             r      PE       r      PE       r      PE 

Dexterity neue eee 20 .10 mt es 2 . 10 
Hand precision (composite 

score) 16 .10 GB «as 18 10 
Hand precision (error score 

only) i 09 .11 -.31 .09 23 .10 
Visual acuity J 16 14 -.43 .1 38 12 
Stereopsis 06 .14 -.18 .1 
Vertical imbalance ; (too few failures) 
Far point fusion 00 .14 08 .14 
Lateral imbal. (far) = 08 .16 24 15 

st ‘¢ (near) ites 01 .16 05 .16 
Near point fusion sae 09 .15 11 = .13 
Ametropia (near) 18 .13 07 .13 
Ametropia (far) : 08 .13 .06 .13 
Color vision 17 = .16 -—.24 .15 
Experience 37 = ©.09 





This group contains operators working on an assembly line, 
and therefore the only criteria of efficiency are the pooled 
ratings already mentioned. 
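
The PE columns of Table III can be checked against the conventional formula for the probable error of a correlation coefficient, PE = 0.6745 (1 - r^2) / sqrt(N). The article does not state which formula was used, so the sketch below is an assumption; the function name is ours, and N = 42 is taken from the table title.

```python
from math import sqrt


def probable_error_r(r, n):
    """Probable error of a correlation r based on n cases.

    Uses the conventional formula 0.6745 * (1 - r^2) / sqrt(n); that the
    article used this formula is an assumption.
    """
    return 0.6745 * (1 - r * r) / sqrt(n)


# Dexterity with pooled efficiency ratings (r = .20, N = 42).
pe_dexterity = probable_error_r(0.20, 42)

# Experience with pooled efficiency ratings (r = .37, N = 42).
pe_experience = probable_error_r(0.37, 42)

print(round(pe_dexterity, 2), round(pe_experience, 2))
```

The larger probable errors shown for the vision tests would follow from this formula if fewer than 42 operators took those tests, though the article does not say so.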

The dexterity test correlates +.20 with the pooled ratings of 
efficiency, —.11 with the amount of plant experience, and +.27 
with efficiency when experience is constant. In other words, 
the better operators tend to score higher on the dexterity test, 
although they have had less plant experience, as indicated by 
the negative correlation of —.11. 

The correlation of hand precision with efficiency is -.16; 
with experience, +.02; with efficiency (experience constant), 
-.18. These figures indicate that the hand precision test in 
this form is related negatively to efficiency on the job. This 
negative correlation between hand precision and efficiency is 
explained by the fact that production and errors were consid- 
ered of equal importance in scoring the test. The better 
operators are more careful. They have apparently acquired 
habits of precision which prevent them from achieving a very 
high score on this test, since their major objective is to keep 
errors as low as possible. The plausibility of this explanation 
follows from the next row of this table, where the correlations 
between the error scores alone of the hand precision test and 
efficiency ratings are tabulated. Here the test correlates +.09 
with efficiency, -.31 with experience, and +.23 with efficiency 
when experience is ruled out. These results indicate that for 
hiring operators for assembly line work of this type, this test 
is of no value when a scoring method which considers produc- 
tion and errors of equal importance is used. A better way to 
score this test is simply to consider the errors, i.e., the larger 
the number of errors, the poorer the score. When scored in 
this manner, the test correlates +.23 with rated efficiency, a 
value that is high enough to make the test contribute to the 
total battery in the hiring of operators. Perhaps a still better 
way of administering the test would be to have the operators 
punch in time with a metronome, so that production would be 
kept constant. We intend to follow this method in further 
experimentation with this test. 

The various tests in the visual battery correlate positively 
with rated efficiency, the only marked deviation being in the 
case of ametropia for near objects. In the second column, we 
observe that a number of the visual tests correlate negatively 
with experience, the values of the negative correlations being 
particularly great for visual acuity, stereopsis, and color vision. 
As the girls failing these tests were not on the average any 
older than those passing, it looks very much as if some aspect 
of the lighting conditions on the radio assembly line may be 
operating to reduce visual efficiency with increasing experience. 
We propose to check this point by retesting these operators at 
an early date, in order to determine whether visual defects are 
increasing more rapidly than can be accounted for by increasing 
age. 

In the third column are the correlations between the various 
tests and the ratings on efficiency with the element of experi- 
ence ruled out by partial correlation. These correlations con- 
stitute the best evaluation of the tests, because if the battery 
is used in employing new operators, the plant experience of 
the applicants will all be the same, i.e., zero. In this column 
we find that efficiency correlates +.27 with dexterity, +.23 with 
error scores on the hand precision test, +.38 with visual acuity, 
and +.29 with color vision. The importance of color vision 
for this particular job is probably due to the fact that in wiring 
a radio set rapid discrimination must be made between differ- 
ent colored wires. 
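
The third-column values can be approximately reproduced from the first two columns with the usual first-order partial correlation formula. The sketch below is ours, not the authors' computation. It assumes that the .37 in the "Experience" row of Table III is the correlation between plant experience and pooled efficiency ratings, and that the first-column values for the two vision tests are positive, as the text indicates.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Assumed reading of the "Experience" row of Table III.
R_EXPERIENCE_EFFICIENCY = 0.37

# test -> (r with rated efficiency, r with plant experience), from Table III
tests = {
    "dexterity": (0.20, -0.11),
    "hand precision (errors only)": (0.09, -0.31),
    "visual acuity": (0.16, -0.43),
    "color vision": (0.17, -0.24),
}

partials = {
    name: partial_r(r_eff, r_exp, R_EXPERIENCE_EFFICIENCY)
    for name, (r_eff, r_exp) in tests.items()
}

for name, value in partials.items():
    print(f"{name}: {value:+.2f}")
```

Because the inputs are rounded to two decimals, the results can differ from the published column by about .01.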

From a statistical viewpoint, four tests, each of which corre- 
lates as highly as these with a criterion, should in combination 
indicate a still higher degree of relationship. The multiple 
coefficient of correlation between these four and rated efficiency 
was found to be +.60. For individual prediction, a coefficient 
of correlation as low as +.60 means that the battery is of little 
value, but for picking operators who, on the average, will be 
superior to the general run, a correlation of this size is fairly 
satisfactory. 
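
A rough illustration of both points follows. The article does not report the intercorrelations among the four tests, so the sketch assumes, purely for illustration, that they are zero, in which case the squared multiple correlation is the sum of the squared validities. The second function is the familiar index 1 - sqrt(1 - r^2), which shows how little a correlation of .60 reduces the error of an individual forecast. Both function names are ours.

```python
from math import sqrt


def multiple_r_uncorrelated(validities):
    """Multiple R under the assumption that the predictors are mutually uncorrelated."""
    return sqrt(sum(r * r for r in validities))


def forecasting_efficiency(r):
    """Proportional reduction in the standard error of estimate: 1 - sqrt(1 - r^2)."""
    return 1 - sqrt(1 - r * r)


# Partial correlations (experience constant) quoted in the text for
# dexterity, hand precision errors, visual acuity, and color vision.
validities = [0.27, 0.23, 0.38, 0.29]

R = multiple_r_uncorrelated(validities)
E = forecasting_efficiency(0.60)

print(f"R = {R:.3f}, forecasting efficiency at .60 = {E:.2f}")
```

Under the zero-intercorrelation assumption the result is close to the reported +.60, which is suggestive but does not show that the tests were in fact uncorrelated.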

The findings in general would seem to support the following 
conclusions : 

1. The finger dexterity test is of no value in picking operators 
for Group I,⁹ but tends definitely to pick operators above 
average in amount and quality of work in Groups II and III. 

2. In Groups I and II the hand precision test picks operators 
who are more efficient than the average, particularly in quality 
of work. When the error scores alone are used, this test is of 
value in picking operators for the assembly line in Group III. 

3. Intelligence, as measured by the Otis Advanced Test, is 


9 See page 241 for a description of the kind of work done by the opera- 
tors in the three groups. 
of no value in picking operators for Groups I and II. If such 
a test is used at all, the applicants scoring lowest should be 
hired, as the findings indicate that the better operators are 
slightly less intelligent than the poorer operators. 

4. The vision tests pick operators who do a large amount of 
poor work, i.e., who produce more than the average, but do 
work of inferior quality. This is probably explained by the 
fact that the girls with defective vision have a lower standard 
for what they consider satisfactory work. 

5. In Group III, defects in vision are present in proportion 
to experience on the job. This may mean that the lighting 
conditions under which these girls are working are unsatis- 
factory. 

6. In Groups I and II, the taller, heavier girls are producing 
more than the shorter, lighter girls. 

7. In Group III, a combination of four tests (finger dexterity, 
hand precision, visual acuity, and color vision) yields a 
multiple correlation of +.60 with the pooled ratings of 
efficiency. 





FALSE IDENTIFICATION OF ADVERTISE- 
MENTS IN RECOGNITION TESTS 


D. B. LUCAS AND M. J. MURPHY 
New York University 


FALSE recognition or false identification of advertisements 
in the field testing of copy is generally acknowledged to 
be a vitiating factor in the scores obtained. Similarity 
of advertisements, usually in the same campaign, is likely to 
cause confusion. This would tend to increase the scores on 
those advertisements, as does any deliberate or ‘‘unconscious’’ 
exaggeration on the part of the respondent. Just how much 
false identification comes into any uncontrolled technique is 
not known. The high prestige acquired by some of the memory 
techniques suggests that the scores have been reasonably accu- 
rate for most purposes. What seems most likely is that the 
suspected error is a variable one which may cause certain 
advertisements to gain or lose out of proportion to others. 

This article deals with the false identification of unpub- 
lished advertisements which the respondent could not have 
seen previous to the interview. As a result of more than 35) 
interviews, using 100 advertisements in each, it has been possible 
to get data on the general range of false identification 
previous to the actual circulation of advertisements. Each of 
the interview sets contained 100 mixed, loose advertisements. 
Fifty of the advertisements were from the current issue of a 
large, general magazine. The remaining fifty were taken from 
forthcoming issues of the same magazine by securing copies in 
advance of regular delivery. Readers of the current issue 
were found by making door-to-door calls. Each reader was 
asked to use his best judgment in identifying the advertisements 
he remembered having seen in the publication.¹ 


1 For detailed description of method and procedure, see: D. B. Lucas, 
The Impression Values of Fixed Advertising Locations in the Saturday 
Evening Post, Journal of Applied Psychology, 1937, 21, pp. 613-631. 

The individual advertisements were checked for appearance 
in other publications, and were eliminated if previously circulated 
elsewhere. All external clues, such as adjacent editorial 
matter and date lines, were removed and the advertisements 
were kept face up when shown. The advertisements 
were not even shown in the order of their page numbers. Thus 
identification in each case depended entirely upon the impres- 
sion made by the advertisement alone. In this and other re- 
spects, this technique provides an opportunity for spurious 
scores which may not be present in equal degree in other 
methods where different conditions are imposed. 

Following is a summary tabulation of all of the false identi- 
fications (Table 1). This includes all advance advertisements. 
There was some guessing on every advertisement in the test 
group, with the single exception of a Parke Davis page in 
black and white. 


The most common tendency, according to the column of 


TABLE 1 

Number of Advertisements Falsely Identified 

Per cent                       Description of advertisements 
identifying 
advertisement     Two      Full page    Full page in       One half    One quarter    Totals 
falsely           pages    in color     black and white    page        page 

5 





0- 4.9 
5- 9.9 
10-14.9 
15-19.9 
20-24.9 
25-29.9 
30-34.9 
35-39.9 
40-44.9 
45-49.9 
50 and above .. 

totals, is for 15 to 20 per cent of the respondents to identify 
advertisements which they have not actually seen. Even more 
impressive is the wide range of the errors (i.e., false identifications), 
with one fifth (19.86 per cent) of the advertisements 
receiving 30 per cent or more of wrong responses. Naturally, 
these highly confused advertisements tended in general to 
rank proportionately high later on when they were tested 
after regular circulation of the magazine. But, because of the 
wide variations in spurious scores, it is not practicable to use 
any average figure as a fixed correction for all advertisements. 
Instead, each advertisement has to be checked both before and 
after regular appearance in order to arrive at a true correction. 
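
The article does not give a formula for this per-advertisement correction. One simple possibility, offered here only as an assumption, is to subtract each advertisement's pre-publication (false) score from its post-publication score. The advertisement names and percentages below are hypothetical.

```python
def net_recognition(pct_after, pct_before):
    """Post-publication score minus pre-publication (false) score, in percentage points.

    A simple difference is an assumed form of correction; the article
    does not specify one.
    """
    return pct_after - pct_before


# Hypothetical advertisements: (per cent after publication, per cent before).
ads = {
    "ad_a": (45.0, 5.0),
    "ad_b": (45.0, 32.0),
}

net = {name: net_recognition(after, before) for name, (after, before) in ads.items()}

for name, value in net.items():
    print(name, value)
```

Two advertisements with the same raw score can thus differ widely once their own false scores are removed, which is why a single average correction would mislead.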

Following are over-all average false identification scores on 
each of the types of advertisement shown in Table 1. (These 
averages were computed before the data were grouped in the 
tabulated classification. ) 


Two Pages ........................ per cent 
Full Pages in Color .............. per cent 
Full Pages in Black and White .... per cent 
One-half Pages ................... per cent 
One-quarter Pages ................ per cent 


All of the data obtained on forthcoming advertisements 
tend to emphasize the necessity of treating each advertise- 
ment separately. While the average ‘‘false’’ scores on dif- 
ferent sizes of advertisement run quite close to one another, it 
is apparent that individual variations within each size are 
about equally great. The quarter-page advertisements seem 
a little less likely to cause guessing, but the real explanation 
for this point might well be found in the frequency with which 
they are scheduled or in interview cooperation, rather than in 
size as it actually affects readership. 

Important, from the standpoint of applying the recognition 
technique, is the performance of the individual respondents in 
the interview. In order to show the wide range of reader per- 
formance, the following table (Table 2) has been worked out on 
the basis of the difference between the per cent of correct and 
the per cent of false identifications in each interview. It is 
assumed that a person who has ‘‘looked into’’ a magazine 
could identify more advertisements from that issue than from 
a forthcoming issue, which he could not have seen. The in- 
tensity and extensity of his reading activities are somewhat 
revealed by the difference between his ‘‘right’’ and ‘‘wrong’’ 
answers. 
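
The score used for each interview can be sketched as follows, assuming the fifty current and fifty advance advertisements described earlier. The function and parameter names are ours.

```python
def interview_difference(correct_identified, false_identified,
                         n_current=50, n_advance=50):
    """Per cent of current ads identified minus per cent of advance ads identified."""
    pct_correct = 100 * correct_identified / n_current
    pct_false = 100 * false_identified / n_advance
    return pct_correct - pct_false


# Hypothetical respondent: recognizes 30 of the 50 current advertisements
# and claims 10 of the 50 advance (unseen) advertisements.
example = interview_difference(30, 10)
print(example)
```

A respondent who claims as many unseen advertisements as published ones scores zero, and one who claims more unseen than published ones scores below zero.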
TABLE 2 


Individual Interview Performance 





Diff. between per cent correct and per cent Number of 
false identifications respondents 


80-89 1 
70-79 1 
60-69 
50-59 
40-49 
30-39 
20-29 
10-19 

0-9 

-9-0 
-19-10 
-29-20 





Total 








If we deal strictly with individual interviews, any perform- 
ance at zero or below suggests at once that the respondent had 
not looked into the magazine. At least he has been peculiarly 
unaware of the majority of some fifty of the largest advertise- 
ments. One might also reason that the distribution of these 
doubtful readers would tend to place as many above zero as 
below it. Therefore, it is equally probable that 33 of those 
above zero as well as the 33 below had never looked into any 
advertising pages of the magazine. This reasoning would sug- 
gest that 66, or a total of 23.4 per cent of the people who said 
that they had seen the magazine might not have seen it at all. 
They might have been deliberately misleading the interviewer! 
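
The arithmetic behind this upper limit is simple enough to state explicitly. The total number of respondents is not given in this passage; 282 is inferred here from 66 being 23.4 per cent, and is an assumption.

```python
def upper_bound_nonreaders(n_at_or_below_zero, n_respondents):
    """Upper-limit share of doubtful readers under the symmetry argument.

    Assumes as many doubtful readers fall above zero as at or below it.
    """
    return 2 * n_at_or_below_zero / n_respondents


# 33 respondents scored zero or below; N = 282 is inferred, not stated.
share = upper_bound_nonreaders(33, 282)
print(f"{100 * share:.1f} per cent")
```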

The writer is not at all inclined to follow the above line of 
reasoning, except as it sets an upper limit to the amount of 
lying on unseen issues of the magazine in question. One 
might, after inspection of the distribution of scores on indi- 
vidual interview performance (Table 2), conclude that there 
had been no errors at all in identifying the magazine issue 
as a whole. The modal performance falls between 10 and 19 
per cent on the positive (or correct) side. Since there is a 
range reaching above 80 per cent correct, it would be statisti- 
cally unique to find that distribution abruptly terminated at 
zero (bearing in mind that the possible negative limit is minus 
100). Most of the 33 persons with negative scores are statisti- 
cally accounted for by the general form of this skewed distri- 
bution. 

Two other major factors make it doubtful that many of the 
respondents falsely identified the current magazine issue to 
the extent first suggested. One is the fact that there are often 
many cues of a more spectacular kind than the advertisements 
to help fix the issue in the reader’s mind. Our interviews 
did not deal in any way with these editorial elements. How- 
ever, in this particular magazine it would require devious 
scheming and forethought to enjoy any editorial features 
without having to pass over some advertising matter. The 
second of these general points is the possibility of motivation 
which would cause deliberate false identification of the issue. 
A respondent who might incline to identify too many adver- 
tisements in order to provide evidence of his literary accom- 
plishment would have less to gain by falsely identifying a par- 
ticular issue. Any field worker knows that there are times 
when people express pride and disdain by asserting that they 
do not read a particular publication. Thus again, it may be 
argued that there is little reason to expect a high percentage 
of false identification of whole magazines. 

In conclusion, following this rationalized discussion, it 
seems unsatisfactory to say merely that the tendency to iden- 
tify a magazine falsely is probably between zero and 23.4 per 
cent. The data presented here represent only a preliminary 
step in finding the true answer to this secondary problem from 
an advertising standpoint. Research now in progress will 
eventually narrow the limits of false identification as set down 
above. 

CONCLUSIONS 

1. There is a high degree of false identification of advertisements 
when advertisements which have not before been published 
are shown to people. 

2. This false identification varies with individual advertise- 
ments from a few per cent to fifty per cent or more. That is, 
some advertisements are falsely identified by over 50 per cent 
of the people and others by 5 per cent or less. 

3. For this reason, no general mathematical formula for the 
correction of false identification is possible. The false identi- 
fication for each individual advertisement must be determined 
separately if any correction is to be made. 

4. The technique used in our study may have tended to dis- 
courage or encourage false identification. In the present 
study the advertisements were shown in a binder, not in the 
form of a complete current magazine such as is shown to 
people in the usual commercial tests. This and other details 
of the technique may have encouraged or discouraged false 
identification. 

5. The results obtained here are naturally attributed to the 
method used in this study. However, it seems permissible to 
conclude, on the basis of the present study, that the recogni- 
tion and identification methods, no matter how used, will need 
some correction for the varying degrees in which different 
advertisements tend to produce false identification. In other 
words, one method may produce more or less false identifica- 
tion than another, but the differences as between individual 
advertisements will still remain. This is due to the fact that 
false identification is a function not of the individual’s memory 
but of the differing character of the advertisements 
themselves. 

BUILDING tests for relief workers and social workers has 
been an entirely new problem, brought about by the recent 
demand for efficient personnel to distribute unemployment 
relief funds. There were no tests on the market to 
measure either aptitude of a novice for the work or knowledge 
and abilities of the experienced or trained worker. The 
following experiment was an attempt to measure some of the 
abilities which are needed by the relief visitor, with a view to 
selecting a battery of tests which could be used in predicting 
successful work, and for determining what qualities should be 
considered in building an aptitude test for this kind of 
employment. 


PROCEDURE 


1. Subjects: The subjects were 61 visitors² of the Pennsylvania 
State Emergency Relief Administration with a mean 
age of 29.25 years and a range from 23 to 50 years. Twenty- 
eight were men and 33 were women. In college education 
they had from zero to four years and a mean of 2.9 years. 
Seven had some professional training beyond college gradua- 
tion and two had attended social service schools. Liberal arts, 


1 The authors wish to acknowledge their indebtedness to Dr. C. H. 
Smeltzer, of Temple University, for his many helpful comments and 
criticisms during the process of this study. 

2 The visitor (sometimes called case worker, investigator, or aide) 
establishes and maintains home contacts with the relief recipient, deter- 
mines eligibility for relief, and carries the primary responsibility for 
making the relief plan with the family and for adjusting relief grants 
in case of changes in the circumstances of the recipient. 
teaching, and technical or scientific courses were the most common 
fields of specialization (35 per cent, 29 per cent, and 15 
per cent, respectively). Seven per cent of these visitors had 
had no work experience prior to their present employment and 
15 per cent had had some experience in social work, ranging in 
time from one to eight years. The remainder of the group 
had been employed in teaching, nursing, or business. They 
had been employed as relief visitors in this organization for a 
mean of 15 months. As measured by all the above items this 
group was a true sample of the entire visiting staff throughout 
the state. 

2. Choice of tests: An analysis³ of the visitor’s duties 
showed that some degree of intelligence, not only in the ab- 
stract sense of reasoning ability or problem solving, but also 
in dealing with social situations, would be needed by the visi- 
tor. Also, her own personality was an important factor in the 
performance of her job. An elementary knowledge of eco- 
nomics might be very useful in judging the material resources 
of relief applicants. General information or school achieve- 
ment was added to this list of possible factors and the follow- 
ing tests were chosen: 


a. Ohio State University Psychological Test. Form 18— 
for general intelligence. 

b. Social Intelligence Test by F. A. Moss, T. Hunt, and 
K. T. Omwake. First edition, revised form—for judg- 
ment in social situations. 

c. Cooperative General Culture Test by the American 
Council on Education. Form 1935—for ability in 
school subjects. 

3 The analysis, which included nine major divisions of work and approximately 
200 specific factors, was based on Pennsylvania’s relief 
manual of procedure, which outlines the definite objectives of a visitor’s 
work and on the description used for determination of job classifications 
and salary scales. These items were discussed thoroughly and 
revised by a committee of supervisors and administrators in the organization 
who were representative of the entire staff and completely familiar 
with visitors and their work, and were then arranged in order of importance. 





72 BOYD R. SHEDDAN AND LOUISE R. WITMER 


d. Cooperative Economics Test by the American Council on 
Education. Provisional form, 1935 (Part II only)— 
for knowledge of economics. 

e. Thurstone Personality Schedule. 1929 edition—for per- 
sonality adjustment. 

Besides these standard tests, one other, designed by the Per- 
sonnel Standards Section of this organization and called a 
Relief Attitudes Scale, was included because it filled a need 
not met by any known tests. Its function is to measure that 
part of a visitor’s job which is concerned with intelligent and 
unprejudiced appreciation of the relief client’s personal prob- 
lems and traits. 

3. Measurement of actual performance (criterion score) : 
Here, again, the above-mentioned job analysis played a major 
role. Obviously, the complexity of the job duties precluded a 
complete measure of performance by objective testing. It was 
decided therefore to use as many procedures as necessary to 
arrive at a dependable result. 

As a first step, all elements which could be measured reli- 
ably by objective testing methods were isolated. This group 
included all the rules and procedures of the organization 
which pertained to the visitor’s work. A test on knowledge of 
this material (to be referred to, in the future, as the ‘‘Test of 
Technical Information’’) had a coefficient of reliability of 
.92 ± .001. The remainder of the evaluation was made by the 
ranking method (called ‘‘Merit Ranking’’). In order to ac- 
complish this all those persons who were familiar with the 
visiting job and with the employees performing it worked to- 
gether to achieve a list of visitors ranked in order of personal 
competence in performing the job. The coefficient of cor- 
relation between the test scores and the merit rankings was 
-.06, which indicated that the two measures represented dif- 
ferent aspects of the visitors’ abilities. 
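
The article later states that the product-moment (Pearson) method was used for these coefficients. A minimal sketch of that computation follows; the data are hypothetical, and merit rankings are assumed to have been expressed so that a higher number means better performance.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five visitors (not data from the study).
test_scores = [1, 2, 3, 4, 5]
merit_scores = [2, 1, 4, 3, 5]

r = pearson_r(test_scores, merit_scores)
print(round(r, 2))
```

If rankings are entered with 1 as the best rank, the sign of the coefficient reverses, so the direction of each scale has to be fixed before interpreting a value such as -.06.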


TEST RESULTS 


1. Distribution of scores: Table 1 shows the medians, means, 
and standard deviations of the scores of the subjects for each 
of the standard tests. In the first column, for comparative 
purposes, are medians taken from the normative materials for 
each test. Thus it may be seen at a glance that the experi- 
mental group were higher than college freshmen on the Ohio 
State University Psychological Test. In other words, seventy 
per cent of this group exceed the median for college freshmen 
in general intelligence. 


TABLE 1 

Distribution of Scores on the Standard Tests 

Name of             Median         Experimental Group 
Test                (norm)⁴        Median     Mean      Standard 
(abbrev.)                                               Deviation 

Ohio State          70             90.5       87.1      26.3 
General Culture     163            112.5      131.4     84.5 
Economics                          15.5       16.8      8.0 
Personality         67             66.1       64.6      19.2 
Social Intel.       ..             ..         117.4     19.3 





Likewise, 77 per cent exceed the median of upper class college 
students on the Social Intelligence Test. However, they fall 
below the average on the Cooperative General Culture Test, 
on which the median score for college juniors is exceeded by 
only 28 per cent of this group. On the Thurstone Personality 
Schedule the median of the visitors is equal to the median for 
college freshmen as presented by the authors of the test. 
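
Statements of this kind compare each subject's score with the published norm median. A small sketch of the computation is given below; the scores are hypothetical, and only the norm of 70 for the Ohio State test comes from Table 1.

```python
def pct_exceeding(scores, norm_median):
    """Per cent of scores strictly above the norm median."""
    return 100 * sum(s > norm_median for s in scores) / len(scores)


# Hypothetical scores for ten subjects (not data from the study).
scores = [60, 75, 82, 91, 68, 95, 88, 71, 55, 102]

share = pct_exceeding(scores, 70)
print(share)
```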

2. Correlations between the various measures: In order to 
determine the relationship between the standard tests and the 
criterion scores the coefficients of correlation were computed 
for each measure with every other measure by the product- 
moment method (Pearson). The two parts of the evaluation, 
(a) the Test of Technical Information, and (b) the Merit 
Ranking by supervisors and administrators were considered 

4 Norms for the Ohio State University Psychological Test and the 
Thurstone Personality Schedule are based on scores of college freshmen; 
all others are in terms of juniors or seniors. There are no norms available 
for Part II alone of the Cooperative Economics Test. 
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both separately and together. When combined, the two parts 
received equal weight and were called the ‘‘ Composite Score.”’ 
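
The product-moment (Pearson) computation named above can be sketched in a few lines. This is a minimal illustration only: the function name `pearson_r` and the five pairs of scores are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each score from its own mean.
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores for five visitors (illustrative, not from the study).
test_scores = [60, 75, 90, 105, 120]
criterion_scores = [40, 55, 50, 70, 65]
r = pearson_r(test_scores, criterion_scores)
```

Applying the same function to every pair of measures yields a table of intercorrelations of the kind reported in Table 2.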

TABLE 2 

Correlation Coefficients of the Various Tests 

                              Criterion 
Name of Test       Technical      Merit       Composite 
(abbrev.)          Information    Ranking     Score 

Ohio State            .60            ?           .37 
Gen. Culture          .35            ?           .18 
Economics             .28            ?           .23 
Personality          -.26          -.18         -.26 
Social Intel.         .58            ?           .30 
Relief Att.           .03           .20          .23 

[The columns of intercorrelations among the standard tests are not legible in this copy.] 

It may be seen from Table 2 that the Test of Technical In- 
formation (column one under Criterion) has its highest cor- 
relations with the two intelligence tests, namely, .60 with the 
Ohio State University Psychological Test and .58 with the 
Social Intelligence Test. Next in order are the correlations of 
.35 with the Cooperative General Culture Test, .28 with the 
Cooperative Economics Test, and .03 with the Relief Attitudes 
Scale. The only negative correlation is one of -.26 with the 
Thurstone Personality Schedule. This last test correlates 
-.18 with the Merit Ranking (column two under Criterion). 
The highest correlation in this column is .20 between the 
Relief Attitudes Scale and the Merit Ranking. Using the 
Composite Score, the correlations are found to be lower but to 
follow approximately the same order as those of the Test of 
Technical Information. The Ohio State Test and the Social 
Intelligence Test are first with correlations of .37 and .30 re- 
spectively. The General Culture Test correlates .18 with the 
Composite Score, the Relief Attitudes Scale .23, and the Eco- 
nomics Test .23. There is again a negative correlation of -.26 
for the Thurstone Personality Schedule. 

5 In order to arrive at a composite score the test scores were ranked, 
and the two sets of ranks were given equal weights and translated into 
linear scores on a 100-point scale. (See Clark L. Hull, Aptitude Testing, 
World Book Company, 1928, pp. 386-393.) 

It appears, therefore, that the Test of Technical Informa- 
tion is, in some degree, a measure of intelligence and general 
information⁶ while the merit rank includes neither of these 
items in any reliable amount, but does tie in with the Relief 
Attitudes Scale. It is possible that, with a larger sampling 
and a wider range in both these items, some correlation would 
appear. However, as has already been pointed out, the sub- 
jects of this experiment are quite representative of the entire 
group with which the problem of tests for visitors is con- 
cerned, and this precludes the expectation of any great 
variations in results. 

6 The reader will remember that tests of achievement generally show, 
provided they are used in fields where variations can be measured, a high 
correlation with intelligence tests, as for example, the Ohio State Test 
and the General Culture Test in this study. 

The relationship between the criterion, the tests, and age, 
education, and experience in the position of visitor is shown 
in Table 3. Immediately it will be seen that these cannot be 
correctly interpreted without a knowledge of the correlations 
between these other variables. Computation gives the follow- 
ing correlations: 

Age with education                    -.35 
Age with length of service            +.25 
Education with length of service      -.50 

TABLE 3 

Correlation Coefficients of Tests with Other Measures 

[Table body not legible in this copy. Rows: Age; Education; Length of service. Column groups: Criterion; Standard Tests.] 

For the problem under consideration, the correlations be- 
tween these three variables and the criterion are of most in- 
terest. Inspection of the table shows that they are all low: 
with age .06, education .12, and length of service —.15. 
Further comment about the individual correlations will add 
very little to the problem at hand, but a brief explanatory 
statement is in order. The correlations presented in the fore- 
going table are dependent upon certain factors which are in- 
herent in the history of the Relief Administration. In short, 
when the organization first came into being there was little or 
no time for a systematic employment of staff. There was work 
to be done and jobs had to be filled immediately. Further- 
more, the employment thus offered had little to recommend it- 
self in the way of either security, salary, or interest. Con- 
sequently, the staff of those early days included all ranges of 
ability and background. As measured by present personnel 
standards the employees were often below rather than above 
the desired level of either capacity or performance. However, 
as personnel procedures were established, standards for the 
employment and retention of staff were raised and in the 
visiting staff emphasis was placed upon the selection of young 
workers with two or four years of college education. For 
these reasons there is some correlation between age and length 
of service, a negative correlation between age and education, 
and a large negative correlation between education and length 
of service. 
CHOOSING THE FINAL BATTERY 


1. Weighting and combining the tests: Inspection of the 
coefficients of correlation in Table 2 shows immediately that 
no one of the tests by itself will serve as an accurate prediction 
of the composite score on the criterion measures. Conse- 
quently, certain ones of the tests were combined according to 
their correlations with the criterion and with each other. It 
will be noted in this connection that in many cases the cor- 
relations between the standard tests are much higher than 
those between the tests and the composite criterion score. 
Several sets of weights for various combinations of the stand- 
ard tests were computed by means of the multiple-regression 
equation, and the forecasting power of each combination or 
battery was found by determining the coefficient of multiple 
correlation (R) by formula. Table 4 shows the resulting R’s 
and the corresponding forecasting efficiency values. Battery 
number one shows the highest correlation but is only slightly 
above the second battery, which contains one less test. In 
number two neither the General Culture Test nor the Eco- 
nomics Test appears. These both had low correlations with 
the criterion and the Economics Test was abbreviated (only 
Part II was used). A comparison of battery number five 
(the Ohio State, General Culture, Personality, and Social 
Intelligence Tests) with number six (which omits the General 
Culture Test) shows that the General Culture Test does not 
add enough to R to justify the extra work involved in using 
it. A comparison of one and two shows the same for the 
Economics Test. When batteries four and three are studied 
together, the Personality Schedule is found to be about equal 
in importance to the Social Intelligence Test. Batteries four 
and six show the greater power of the Relief Attitudes Scale 
as compared to the Personality Schedule. When used together 
with the Social Intelligence Test and the Ohio State Test 
(which were common to both four and six) R goes up to .72, 
which is next to the highest found.⁷ 

TABLE 4 

Correlation Yields of Various Combinations of the Standard Tests 

                   Tests used in Battery 
Battery   Ohio    Gen.    Eco-     Per-    Social   Rel.    Correlation   Forecast. 
No.       State   Cult.   nomics   son.    Intel.   Att.    (R)           Efficiency (E) 

1           x               x        x       x       x         ?            33% 
2           x                        x       x       x        .72           31% 
3           ?       ?       ?        ?       ?       ?         ?            17% 
4           x                                x       x         ?            15% 
5           x       x                x       x                 ?             5% 
6           x                        x       x                 ?             4% 
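
The two quantities reported in Table 4 can be illustrated with a short sketch. It rests on two assumptions that the text does not state: that the multiple correlation for a pair of tests follows the standard two-predictor formula, and that the forecasting efficiency is the index E = 1 - sqrt(1 - R^2). Under the second assumption an R of .72 gives about 31 per cent, which agrees with the second entry in the E column. The function names and the intercorrelation of .46 are hypothetical.

```python
from math import sqrt


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return sqrt(r_squared)


def forecasting_efficiency(r):
    """Assumed index of forecasting efficiency, 1 - sqrt(1 - R^2), as a fraction."""
    return 1 - sqrt(1 - r**2)


# .37 and .30 are the Composite Score correlations quoted in the text for the
# Ohio State and Social Intelligence Tests; the intercorrelation of .46 is a
# hypothetical value used only for illustration.
r_battery = multiple_r_two_predictors(0.37, 0.30, 0.46)

# R = .72 is the value reported for battery two.
e_battery_two = forecasting_efficiency(0.72)
```

Because E grows slowly with R, a battery must reach a fairly high multiple correlation before its forecasts improve much on a guess at the mean, which is why the small differences in R between neighboring batteries matter so little.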
Inasmuch as battery two⁸ is judged to be the most useful, 
the prediction formula for those tests alone is given below: 

    X_1 = -.186 X_3 - .578 X_6 + 2.873 X_8 + .829 X_7 - 94.678 

in which 3 is the Ohio State Test, 6 the Personality Schedule, 
8 the Relief Attitudes Scale, and 7 the Social Intelligence 
Test. The formula may be corrected for dispersion in the 
usual fashion, if desired, by multiplying the weights by .72. 
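
As a worked illustration, the prediction formula can be applied to one applicant's scores. The sketch below takes the weights, their signs, and the constant exactly as printed, and assumes that the four terms follow the order in which the tests are named (3, 6, 8, 7). The applicant's scores are hypothetical, and the band of one standard error uses the 12.7 points reported for this battery.

```python
# Weights and constant as printed in the prediction formula.
WEIGHTS = {
    "ohio_state": -0.186,          # test 3
    "personality": -0.578,         # test 6
    "relief_attitudes": 2.873,     # test 8
    "social_intelligence": 0.829,  # test 7
}
CONSTANT = -94.678
STANDARD_ERROR = 12.7  # standard error of estimate reported for battery two


def predict_composite(scores):
    """Predicted Composite Score from the four raw test scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) + CONSTANT


# Hypothetical applicant (illustrative scores, not data from the study).
applicant = {
    "ohio_state": 90,
    "personality": 65,
    "relief_attitudes": 30,
    "social_intelligence": 117,
}
predicted = predict_composite(applicant)

# Band of one standard error of estimate on either side of the prediction.
low, high = predicted - STANDARD_ERROR, predicted + STANDARD_ERROR
```

A band of this width on a 100-point scale is a reminder that the formula ranks applicants only roughly, even with a multiple correlation of .72.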


DISCUSSION 


The comparatively simple methodology of this investigation 
shows that written tests may be used in measuring abilities of 
relief visitors. Such a conclusion was not readily acceptable 
to the social work profession in the early days of relief pro- 
grams because of the great emphasis placed upon personal 
qualifications by the nature of the duties of the job. Any per- 
son familiar with social work personnel can quote innumerable 
instances in which employees with the requisite information 
are unable to accomplish satisfactory job performance because 
of personal factors, which are less easily measured. On the 
other hand, the sudden growth of pressure for civil service 
methods of personnel administration for these positions de- 
mands immediate intensive study of all conceivable appoint- 
ment procedures. The supply of individuals experienced in 
social work methods is not nearly great enough to answer the 
demand. Hence, some method other than investigation of the 
experience record must be found. The study here presented 
is a sample step in the problem. The procedure described is 
no different from that used in other employment fields and it 
is indicated that such procedures are also applicable here. 
Further study may thus follow the usual processes and it is 
hoped that other investigators may add their efforts to the 
construction of valid and reliable tests for relief visitors. 

7 It is worthy of note that the Personality Schedule is important in 
spite of its negative correlation with the criterion measures (Thurstone 
reports higher scholarship for maladjusted students; see Instructions 
for using the Personality Schedule) and the Social Intelligence Test also 
carries weight although it has a fairly high correlation with other tests 
used. 

8 The standard error of estimate of this battery is 12.7 points. 


CONCLUSIONS 

A battery of tests consisting of the Ohio State University 
Psychological Test, the Moss Social Intelligence Test, the 
Thurstone Personality Schedule, and a Relief Attitudes Scale 
was found to have a correlation of .72 with the job efficiency of 
61 visitors as measured by their scores on a Test of Technical 
Information about their duties and by a Merit Ranking of 
their personal performance based upon the ratings of their 
supervisors. This correlation is higher than that usually 
found in studies of this sort. 

With this information at hand it should be possible to build 
a single test covering the various factors found in the total test 
battery. 





THE RELATION BETWEEN VOCATIONAL 
INTERESTS OF MEN IN COLLEGE AND 
THEIR SUBSEQUENT OCCUPATIONAL 
HISTORIES FOR TEN YEARS 
DOROTHY T. DYER 
Bucknell University 


In the spring of 1935 the writer carried to completion the 
last five-year follow-up of the 101 cases included in a 
study begun in 1924 at the University of Kansas.¹ The 
purpose of the present study was to analyze the permanence 
of the vocational interests of this group of college men ten 
years after graduation in relation to: (1) the time the choice 
was made; (2) the chief origin of choice; and (3) ratings by 
Strong’s Vocational Interest Blank. 

The following questions were included in a letter to each 
man in the original group of 101: 

1. What occupation or occupations have you been engaged 
in since 1930? How long did you remain in each? 

2. Have you had any periods of unemployment? How 
long? 

3. Would the offer of $100,000 lead you to change the na- 
ture of your occupation? If so, what would you change 
to? 


Eighty-nine responses were received, many of which gave 
more information than asked for in the questions. Published 
poems, research publications, a great deal of interesting family 
history, some tragedy as well as success were reported. One 
man has died, and the other eleven who did not respond seem 
to have disappeared completely, although every effort has 
been made to reach them. 


1 A thesis by John R. Dyer submitted to the Department of Psychol- 
ogy and the Graduate School of the University of Kansas in partial 
fulfillment of the requirements for the degree of Master of Arts, April 
16, 1930, and published in the Journal of Applied Psychology, 1932, 16, 
233-240, under the title: ‘‘Sources and Permanence of the Vocational 
Interests of College Men: 101 Cases over a Five Year Period.’’ 


Strong’s Vocational Interest Blank was sent to each of the 
eighty-nine who responded to the first letter, with a request 
to take the test. Sixty-two of the tests were completed, returned, 
and scored, and a report of the results sent to each man. The 
other twenty-seven failed to respond after three requests, sev- 
eral indicating unwillingness to ‘‘expose their weaknesses.’’ 

The most interesting part of the study is contained in the 
101 life histories included in the report which is in the library 
of the University of Minnesota.2 There are stories of voca- 
tional changes due to many causes, from physical handicaps, 
financial difficulties, and family pressure, to a search for some- 
thing more in line with abilities and interests. However, the 
great majority of histories indicate that the vocational choice 
made in the senior year of college held through at least ten 
years of occupational history, as Table 1 shows. 


TABLE 1 

Number and Per Cent of Cases in Which College Major Held 
for Five and Ten Years 

                                 Number    Per cent 

Major held 5 years                 69         78 
Major did not hold 5 years         20         22 
Total number of cases              89        100 

Major held 10 years                64         72 
Major did not hold 10 years        25         28 
Total number of cases              89        100 


Three persons whose major did not hold for five years re- 
turned to the original major field at some point between the 
fifth and tenth years. Since their work in the major field was 
not continuous for the full ten-year period, they are listed 
under those whose major did not hold for ten years. Case 43 
majored in physics but went into chemical research for five 
years. At the end of ten years he had returned to physics. 
Cases 49 and 53 changed from business back to law, their 
original major in college, at the end of ten years. 

2 Dyer, Dorothy T., The Relation between Vocational Interests of Men 
in College and Their Subsequent Occupational Histories for Ten Years. 
M.A. thesis prepared under the direction of Professor Donald G. Pater- 
son on file in the University of Minnesota Library. June, 1937. 250 pp. 

The evidence in these 89 cases indicates that college work 
does prepare for the vocation followed for ten years after 
graduation in more than 70 per cent of the cases. Here is 
impressive evidence that a college education and all that that 
term implies is truly a preparation for life, at least in the area 
of vocational adjustment. 

The 89 cases are divided as follows in relation to the major 
field of study in college: 

Business          27 
Engineering       14 
Law               12 
Journalism        12 
Medicine           6 
Education          4 
Scattering        14 

Total             89 


In the original study³ it was noted that ‘‘no significant dif- 
ference in the vocational holding power of the various voca- 
tions appeared at the end of five years.’’ At the end of ten 
years, however, the proportion of cases majoring in journalism 
and education that did not hold is conspicuously large. 
Three-fourths of those majoring in journalism (9 out of 12), 
and three-fourths of those majoring in education (3 out of 
4) had changed to totally different fields. Since the number 
of cases is small, possibly no great significance should be 
attached to the holding power of these vocations as the cause 
for change. It is possible, of course, that the depression may 
have influenced these professions more than some others. 

The stability of occupational choice is best expressed in the 
four summary tables 2, 3, 4 and 5. 


3 Dyer, John R., Sources and Permanence of Vocational Interests of 
College Men. Jour. Appl. Psych. 1932, 16, 238. 

TABLE 2 

Permanence of First and Other Choices 

                         Vocation first       Vocation fol-        Vocation fol- 
                         followed on          lowed at end         lowed at end 
                         leaving college      of 5 years           of 10 years 

                         Number  Per cent     Number  Per cent     Number  Per cent 

1st college choice         74       83          71       79          65       73 
2nd college choice         10       11          10       11          10       11 
3rd college choice          3        3           ?        ?           2        2 
4th college choice          0        0           ?        ?           0        0 
Choice not previ- 
  ously mentioned           2        2           ?        ?          12       13 

Total                      89       99          89        ?          89       99 


At the end of ten years 65 were engaged in choice no. 1 
made as college seniors; 10 in choice no. 2; 2 in choice no. 3; 
and 12 at vocations not listed as choices ten years earlier. As 
mentioned before, three of the 65 engaged at choice no. 1 had 
made changes in vocations during the ten years but had come 
back to their first choice. Both the numbers actually entering 
upon their first choices and the numbers engaged in them ten 
years later indicate a high degree of permanence in vocational 
interest as expressed at the time of college graduation. 

TABLE 3 

Permanence of Interest with Reference to Time of Vocational Choice 

                                            Vocational choice     Vocational choice 
                                            held 5 years          held 10 years 

                                   Number   Number   Per cent     Number   Per cent 

Choice made before high school       28       25        89           ?        78 
Choice made during high school        ?       19        76           ?         ? 
Choice made during college            ?        ?        72           ?        59 
Choice made at miscellaneous times    ?        ?         ?           ?         ? 

Total                                 ?        ?         ?           ?         ? 

At the end of five years 89 per cent of those choices which 
had been made before entrance to high school still held, and at 
the end of ten years 78 per cent held. On the other hand, 72 
per cent of the choices made during college held for five years 
but only 59 per cent held for ten years. These figures show 
(1) that choices made before and during high school held 
rather consistently for ten years, and (2) that choices made 
during college tend to be somewhat less stable than those made 
earlier. 

TABLE 4 

Permanence with Reference to Chief Source of Choice 

                                    Vocational choice     Vocational choice 
                                    held 5 years          held 10 years 

Source of choice           Number   Number   Per cent     Number   Per cent 

Family situation             26       23        88          23        88 
Boyhood occupation           12        ?        83           9        75 
Class room inspiration       11        ?        82           8        72 
Hobbies                      10        ?        90           8        80 
Counsel of friends            9        ?         ?           6        67 
Origin not clear             13        8        61           5        38 
Miscellaneous origin          8        6        75           3        37 

Total                        89        ?         ?           ?         ? 





In 26 cases the family situation or tradition seemed to be the 
chief source of choice. These cases followed a father’s or a 
close relative’s profession or business; often with a partner- 
ship as an incentive. Of these 26 cases, 23 or 88 per cent 
remained permanent for ten years. The size and permanence 
of this group make it significant. In 12 cases the boyhood 
occupation was reported as the basis for choice, and 9 of the 12 
or 75 per cent were still engaged in the vocation at the end of 
10 years. Eleven cases based their decision on the influence 
of teachers or of a course of study in high school or college, and 
8 or 72 per cent remained with the decision for ten years. Of 
the 10 who based their decision on a hobby, 8 or 80 per cent 
were following their chosen vocation at the end of ten years. 
Of the 9 who were guided by the counsel of friends, teachers, 
or relatives, 6 or 67 per cent remained true to it for ten years. 

The two groups who got their vocational beginnings in boy- 
hood occupation or hobby are significant, both from the stand- 
point of permanence and from the fact that the source of the 
decision may have involved more of the individual’s choice, 
free from other pressures. 

The two groups whose chief origin of choice showed the least 
holding power were those of ‘‘origin not clear,’’ and of ‘‘mis- 
cellaneous origin.’’ At the end of five years 8 or 61 per cent 
of the 13 whose ‘‘origin of choice was not clear,’’ held, but at 
the end of ten years, only 5 of 13 or 38 per cent still held. Of 
the group ‘‘miscellaneous origin,’’ 6 of the 8 or 75 per cent 
held for 5 years but only 3 of the 8 or 37 per cent held for ten 
years. 
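
The ten-year percentages quoted for each source of choice follow directly from the counts given in the text, and the arithmetic can be checked with a short sketch. The counts are those stated above; the names `TEN_YEAR_COUNTS` and `percent_held` are hypothetical. The results are carried to one decimal place, whereas the text prints whole numbers, and in a few cases (72 and 37) the printed figure appears to be truncated rather than rounded.

```python
# (cases in group, cases still holding after ten years), as stated in the text.
TEN_YEAR_COUNTS = {
    "Family situation": (26, 23),
    "Boyhood occupation": (12, 9),
    "Class room inspiration": (11, 8),
    "Hobbies": (10, 8),
    "Counsel of friends": (9, 6),
    "Origin not clear": (13, 5),
    "Miscellaneous origin": (8, 3),
}


def percent_held(total, held):
    """Per cent of a group whose choice held, to one decimal place."""
    return round(100 * held / total, 1)


ten_year_rates = {
    source: percent_held(total, held)
    for source, (total, held) in TEN_YEAR_COUNTS.items()
}

# The seven groups together should account for all 89 cases.
total_cases = sum(total for total, _ in TEN_YEAR_COUNTS.values())
```

With groups this small, a single case moves a percentage by eight to twelve points, so differences between neighboring groups should be read with caution.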

Permanence with reference to measured interests as scored 
by Strong’s Vocational Interest Ratings is summarized for 61⁴ 
of the 89 cases of this study in Table 5. 


TABLE 5 

Permanence with Reference to Measured Interest (Strong’s Ratings) 

              Vocation of first      Vocation of second     Vocation at end of 
Strong’s      choice followed at     and third choices      10 years not in list 
letter        end of 10 years        followed at end        as choice at college 
ratings                              of 10 years            graduation 

              Number   Per cent      Number   Per cent      Number   Per cent 

A               19       41.0           ?       67.0           2       40.0 
B+               ?       25.0           ?       11.0           ?        0.0 
B                ?       17.0           ?       11.0           ?       40.0 
B-               ?        8.5           ?       11.0           ?        0.0 
C                ?        8.5           ?        0.0           ?       20.0 

Total           47      100.0           9      100.0           5      100.0 

4 Case 73 is omitted from this table because it was impossible to score 
‘‘gambling house proprietor’’ on Strong’s test. It reduces the number 
of scored cases from 62 to 61. 

At the end of 10 years, 39 of the 47 cases, or 83 per cent of 
those following their first choice of occupation for ten years, 
rated A, B+, or B, while 8 or 17 per cent received ratings of 
B- and C. Of those who followed their second and third 
choice for 10 years 8 of the 9 or 89 per cent were rated A, B+, 
and B while only 1 rated B-. Of those who followed a ‘‘choice 
not made at the time of college graduation’’ at the end of 10 
years, 4 of the 5 or 80 per cent made ratings of A or B, and 1 
or 20 per cent rated C. 

These figures show that individuals who exhibit a high de- 
gree of permanence of vocational choice tend to receive A and 
B ratings on the appropriate scoring key in Strong’s test. 

The question, ‘‘ Would the offer of $100,000 lead you to 
change the nature of your occupation,’’ was asked to discover, 
if possible, how much real satisfaction and success the vocation 
itself had for a man, apart from financial pressure. 

At the time of the original interview in college, 16 of the 89 
cases would have changed their vocational choice if they had 
been financially able to do so, while 27 out of 89 would have 
changed with financial independence ten years later. Seven 
of the cases indicated changes each time but only three would 
have made the same change each time. One other indicated 
changes related to his original choice and three others named 
unrelated choices. Case 15 continued in his first choice, that 
of business, throughout 10 years, but each time indicated that 
if he were financially able he would write. Strong’s test rated 
him A in business and also A as editor. Case 40 indicated 
twice, writing, as his $100,000 choice, though he had followed 
the vocation of electrical engineer, his first choice for ten years 
in which he rated a C, while he received an A rating as an 
editor. Case 9 followed choice no. 2 as chemist for ten years, 
but indicated a change to teaching chemistry both at the time 
of the original interview and ten years later. As a chemist 
he rated A and as engineer a B, while as teacher he rated B. 

In the case of the 18 who are dissatisfied and would change 
with an offer of financial independence, the Strong’s test rat- 
ings indicate that about half could reasonably expect voca- 
tional satisfaction as far as measured interests are concerned 
in the occupation in which they are engaged. Changes to the 
$100,000 choice would lower the chances of satisfaction as 
judged by interest ratings on Strong’s test. If the number 
of cases were larger, great significance might well be attached 
to this evidence, which would indicate that subjective choices 
made free from financial pressure are not necessarily well 
founded choices when measured objectively for interest, and 
that some factors other than interest of equal or greater impor- 
tance enter into vocational dissatisfaction. 

In analyzing the results for the group from whom no data 
are available for the last five years of the study, it was found 
that 8 of the 12 cases or two-thirds had an occupational history 
of stability for the first five years out of college. Again the 
family situation and boyhood occupation were responsible for 
the chief source of the choice. Those whose choices were made 
before high school were more permanent for the first five years. 
The summary of first and other choices indicates a 100 per cent 
permanency for the first college choice for the five year period. 
Three of this group expressed a desire to change their vocation 
with the offer of $100,000 at the time of the original interview. 
It is impossible to tell what happened to 11 of this group be- 
tween the years 1930-1935. Case 46 died in 1932. The rest 
simply disappeared though an exhaustive effort was made to 
contact them. Surprisingly enough, these ‘‘lost cases’’ do not 
appear to be different from the rest of the group as far as data 
at the end of 5 years out of college are concerned. This fact 
suggests that if they were victims of the depression then the 
depression itself operated on the ‘‘stable’’ and ‘‘unstable’’ 
with equal force. 


CONCLUSIONS 


1. There is a high degree of stability in vocational interests 
as expressed in the college years and followed in later life. 

2. Vocational decisions of college graduates made early in 
life have the greatest holding power. Those who make deci- 
sions in high school and college change more frequently. Col- 
lege training offers greater variety of choice, but it also forces 
vocational choice, if one has not been made earlier, to conform 
with the requirements of specialization in college. This may 
or may not result in a wise choice. 

3. Decisions made in line with the family tradition, boy- 
hood occupations, and hobby have a surprisingly high degree 
of permanence. 

4. Stability or permanence in vocational choices expressed 
by college seniors is affected by interest and may be measured 
with considerable success. The use of measured interests as 
one instrument of guidance would seem to be highly desirable. 





A COMPARISON OF COOPERATIVE TEST 
SCORES AND HIGH SCHOOL GRADES 
AS MEASURES FOR PREDICTING 
ACHIEVEMENT IN COLLEGE 


JOSEPH V. HANNA 
Washington Square College, New York University 


In February, 1935, a series of Cooperative Tests was admin- 
istered to freshmen entering Washington Square College, 
New York University. This program with certain addi- 
tions and deletions was repeated the following September, and 
for several of the entering freshmen groups thereafter. The 
purpose of the present paper is to report relationships between 
test scores and scholastic attainments for students in the Col- 
lege, and to present interrelationships among test scores, 
grades in the different college subjects, and grades in the dif- 
ferent high school subjects. 

Scores made by two groups of entering freshmen, those 
entering in February, 1935, and in September, 1935, enter into 
the study here reported. The larger (September) group was 
sampled by taking every other student from a list arranged 
alphabetically. The sample of the September group 
consisted of approximately 500 students, whereas the February 
group was slightly larger than this sample. The smaller number 
reported for any one test arises because not all students 
took every test in the battery, owing to absence, etc. There 
would seem little question, however, of the adequacy of the 
samplings for certain of the tests, especially English, mathe- 
matics and French. Among the tests taken by both groups of 
entering students were the following: Cooperative English 
Test, Cooperative General Mathematics Test, Cooperative 
French Test, Cooperative German Test, and Cooperative 
Spanish Test. All these tests were published by the Coopera- 
tive Test Service of the American Council on Education. 

The correlations offered in the several tables represent rela- 
tionships between scores made on each of the several tests and 
performance in the corresponding subjects in Washington 
Square College, between test scores and grades made in the 
several high school subjects as represented by the official grades 
certified by the different high schools, between grades in each 
high school subject and each corresponding college subject, and 
a series of intercorrelations among these several scores and 
grades. Usually the tables include correlations between gen- 
eral averages as well as between each separate score and grade. 
Correlations between high school grades and Cooperative Test 
Scores, however, were computed for the September group only. 

Criteria of attainment against which test scores were corre- 
lated are based on one term’s work (one-half year) for the 
February group, and on one year’s work for the September 
group. All college grades were translated into numerical 
equivalents for purposes of the comparison. Likewise, nu- 
merical equivalents of the high school grades as certified by 
the different high schools, were used. All the work done by the 
student in any subject in his several years of high school resi- 
dence, or in his one or two terms of residence in college, was repre- 
sented by a single numerical score. For example, if the 
student had taken three years of mathematics in high school, 
or had taken college mathematics throughout two terms, the 
average of all separate grades in mathematics was taken as an 
index of his ability in mathematics within high school, or the 
College. The same procedure was followed with respect to 
each subject. In each case, therefore, the single Cooperative 
Test Score (English, general mathematics, and each of the 
languages, French, German, Spanish) enters into compari- 
son with a single numerical grade for performance in the corre- 
sponding subject in high school or college. 

TABLE 1 

A Comparison of High School Grades with College Grades; and of 
Cooperative Test Scores with College Grades for 
Entering Freshmen, February, 1935 

                         High school grades           Cooperative test scores 
                         with college grades          with college grades 

                         No. students    r    P.E.    No. students      r    P.E. 

English                      416        .49   .02     376               ?     ? 
Mathematics                  319        .35   .03     193               ?     ? 
French                       138        .54   .04      83 (3)*          ?     ? 
                                                       42 (2-2½)*       ?     ? 
German                        76        .55   .05      41               ?     ? 
Spanish                       68        .49   .06      43 (2-2½)*       ?     ? 
General Average 
  (Eng., math., 
  foreign languages)         402        .51   .03     380               ?     ? 

* Numbers in parentheses indicate number of years students had 
studied the language in high school. 
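
The averaging described above, and the P.E. column that accompanies each correlation in Table 1, can be sketched as follows. The paper does not define P.E.; the sketch assumes it is the conventional probable error of a correlation coefficient, 0.6745 (1 - r^2) / sqrt(N), which gives about .04, .05, and .06 for the French, German, and Spanish high school grade correlations, in agreement with the printed values. The function names and the sample grades are hypothetical.

```python
from math import sqrt


def subject_index(grades):
    """Average of all numerical grades a student earned in one subject."""
    if not grades:
        raise ValueError("at least one grade is required")
    return sum(grades) / len(grades)


def probable_error_of_r(r, n):
    """Assumed probable error of a correlation: 0.6745 * (1 - r^2) / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 0.6745 * (1 - r**2) / sqrt(n)


# Hypothetical student with three years of high school mathematics.
math_index = subject_index([82, 88, 91])

# r and N for high school French grades with college French grades (Table 1).
pe_french = probable_error_of_r(0.54, 138)
```

Since the probable error shrinks with the square root of N, the language correlations, which rest on fewer than 140 students each, are less firmly established than those for English and mathematics.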

Data for the February group are presented in Table 1, and 
for the September group in Table 2. For purposes of com- 
parison most of the data included in Tables 1 and 2, in 
rearranged form, are presented in Table 3. Correlations rep- 
resenting comparison of high school English grades, and Co- 
operative English Test scores respectively with college achieve- 
ment in English, show general similarity for the two groups of 
entering freshmen. Results are all but identical for the two 
groups, and show that four years of achievement 
in high school English is approximately the same in prognostic 
value as the single score on the Cooperative English Test, 
Series I, which requires ninety-five minutes for administration. 

Comparison of high school mathematics grades with college 
grades in mathematics shows rather low correlations for both 
of the student groups. Scores on the Cooperative General 
Mathematics Test, however, show as satisfactory correlation 
with college achievement in mathematics for both groups of 
students as would have been expected. High school grades 
in French are considerably more valid for the February group 
than for the September group in relation to college achieve- 
ment in French. Cooperative French Test scores, however, 
show no such difference for the two groups of students. In
general it would seem that Cooperative French Test scores are
somewhat more valid than high school grades, in relation to
achievement in college French. Due to the small numbers of
students in German and Spanish, the relationships would seem
not to be sufficiently reliable for significant generalization.


TABLE 3

A Comparison of High School Grades with College Grades, and of
Cooperative Test Scores with College Grades, for Two Groups of Students

(Table body not legible in this copy. Column groups: high school
grades with college grades; Cooperative Test scores with college
grades; each for the February, 1935 group and the September, 1935
group, giving number of students, r, and P.E. Rows: English,
Mathematics, French, German, Spanish, General Average.)


TABLE 4

Intercorrelations among High School Grades, College Grades and
Cooperative Test Scores, Entering Freshmen, September, 1935

(Table body not legible in this copy. Column groups: High School
Grades, College Grades, Cooperative Test Scores, each giving number
of students, r, and P.E. Rows: English with Math., English with
French, English with German, English with Spanish, Math. with
French, Math. with German, Math. with Spanish.)

Intercorrelations among high school grades, college grades
and test scores for the September group are presented in Table
4. One significant difference would seem worthy of comment.
The general intercorrelations among high school grades and
college grades, with one or two exceptions, are approximately
the same in size, languages and mathematics, for example,
being about as closely interrelated as languages and English.
It will be observed, however, that intercorrelations among
Cooperative Test scores, with the exception of English and
French, are generally lower than corresponding correlations
among achievement grades. This is especially true with respect to
intercorrelations between mathematics and French, German
and Spanish. This observation is substantiated by similar
intercorrelations among Cooperative Test scores for the
February group, which are not offered in tabular form since
intercorrelations among high school and college grades were not
computed for this group. The correlations among Cooperative
Test scores for the February group, however, are as follows:
English with Spanish, r=.69; English with French, r=.56;
English with German, r=.32; English with mathematics,
r=.43; mathematics with Spanish, r=.28; mathematics with
German, r=.19; mathematics with French, r=.18. Probable
errors are about the same as for similar correlations for the
September group. Only one correlation, mathematics with
Spanish, differs significantly from the corresponding one for
the September group.
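The probable errors mentioned here are not accompanied by a formula in the article. A minimal sketch, assuming the conventional formula P.E. = 0.6745(1 − r²)/√N, is given below. The check value reads the French row of the surviving table fragment as N = 138, r = .54, P.E. = .04; that reading of the damaged layout is an interpretation, not something the text confirms.

```python
import math

def probable_error_r(r, n):
    """Probable error of a Pearson correlation under the conventional
    formula 0.6745 * (1 - r**2) / sqrt(n). That the article used this
    formula is an assumption."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# French row as read from the table fragment: N = 138, r = .54.
pe_french = probable_error_r(0.54, 138)
print(round(pe_french, 2))
```

Under this assumption the computed value rounds to the tabled .04, which suggests, without proving, that the conventional formula was the one applied.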

Several questions arise as to the interpretation of these
differences. Why, for example, should mathematics and any one
of the foreign languages have more in common on the basis of
high school and college grades than on the basis of the more
objective Cooperative Test scores? The writer shall not
attempt to answer this question. There are, however, one or two
theoretically logical considerations. It would seem likely that
a grade in mathematics or a foreign language might represent
more than mere achievement in the subject itself. The way
the student impresses his instructor (his habits of work, his
general demeanor, his personality) might influence the
instructor as he submits his final mark. Such general factors,
if present, would help to explain how a student might get
about the same grade in mathematics as in a foreign language,
whereas he might get essentially different scores in the two
subjects on tests based solely on more specific
subject-content, where such personal factors cannot enter.
There is the assumption that high school and college grades are
based to an extent on factors which cannot be represented
adequately in an objective test. If this were true, then it
would be expected that high school grades would be more valid
in predicting college achievement than scores made on strictly
objective tests. This inference has little support on the
basis of findings of the present study. For the groups of
students included here, the Cooperative Mathematics Test is more
valid than high school grades in mathematics in predicting
achievement in college mathematics. In general, it can be
inferred that the Cooperative Test scores in the languages are
as valid as high school grades in predicting achievement in the
languages in college, in so far as generalizations can be made
on such small samplings of students.


SUMMARY AND CONCLUSIONS 


For the groups of students studied, Cooperative English
Test scores were found to be approximately equivalent to
grades covering four years’ experience in high school English,
in predicting achievement in college English. Cooperative
General Mathematics Test scores proved to be a decidedly
better basis for predicting achievement in college mathematics
than did high school grades in mathematics. Cooperative
French Test scores showed a somewhat closer relationship with
achievement in college French than did high school
grades in French. Equivocal results with respect to similar
scores on the Cooperative German and Cooperative Spanish
tests may be attributed to the small samplings of scores and
grades in these subjects. Reliable generalizations for these
comparisons cannot be made.

Intercorrelations among Cooperative Test scores, high school
grades, and college grades, with the possible exception of
English with French, show the Cooperative Test scores to be
more distinct from one another than are high school and college
grades. It is likely that the greater objectivity of the
Cooperative tests contributes in no small measure to these results.





AN EXPERIMENT IN MEASURING THE 
EFFECT OF GROUP INSTRUCTION 
FOR FOSTER PARENTS 
ALICE LEAHY SHEA AND HELEN RUTH HERTZ
University of Minnesota 


RECENT years have seen a tremendous increase in parent
education through group instruction. As a corollary
of this, foster-parent education has been attempted in
many cities. In Minneapolis a program of foster-parent
education was begun in 1929, at which time the Foster Home
Council, an informal organization of child-placing agencies,
sponsored a series of lectures on child care and training.¹
Since that time a series of five or six lectures covering a wide
range of topics has been given annually. Problems of
discipline, of health, of sex education and of character building
were among the topics discussed. Speakers in the main have
been professional persons: clergymen, doctors, social workers
and psychiatrists.

In addition to the educational function of the lectures, the
Foster Home Council planned to make each lecture a social
occasion for the mothers. A general reception with
refreshments followed each lecture. Meeting other persons who were
doing the same kind of work, visiting informally with
executives of agencies and representatives of agencies other than
the one for whom the mothers boarded children, all gave
promise of increasing the respective mother’s sense of the social
importance of her work. Further, it was believed that in this
social exchange the mothers would learn that the problems of
child management are universal and that there is no road by
which they may be quickly eliminated. Although complete
records were not kept, it is estimated that from fifty to one
hundred fifty mothers attended each meeting. A large number
of persons have therefore participated in the program
since its inception.

1 The agencies composing the Foster Home Council are: Catholic
Welfare Association, Children’s Protective Society, Hennepin County
Child Welfare Board, Jewish Family Welfare, Lutheran Children’s
Friend Society, Lutheran Welfare Society, and Washburn Home.

The present study is an attempt to evaluate by means of an
attitude test the educational function of the lectures as
previously described. The real evidence of success would
undoubtedly lie in the extent to which the mothers’ practices in
child care and training have been improved. The difficulties
of evaluating practice are obvious. Only by means of
controlled observations before and after the pursuit of a course
of instruction in child care could one hope to make a reliable
evaluation of such instruction. But since performance and
attitudes have been found to be positively correlated, it seems
entirely proper to undertake an evaluation of attitudes in
reference to practice situations.

The attitude scale² constructed for the purpose of our
evaluation is composed of thirty statements of what to do in certain
child care and training situations. Habits, emotions, sex
behavior, discipline, physical health, mental health and
parent-child relationships constitute the areas included in the test or
questionnaire. To each statement a respondent is asked to
indicate his judgment by underlining one of the following
words: strongly disagree, disagree, doubtful, agree, strongly
agree. What constituted the correct response had been
previously determined by seven experts representing the fields
of psychology, child welfare and social work. Weights
ranging from 5 to 1 had also been assigned to each response level.
The correct or perfect response thus receives a weight of 5;
the least correct response, or the wrong response, is given a
weight or score of 1. The correct response to the first
statement, which reads, “The purpose of discipline is to bring
about strict obedience to the person who is administering it,”
is “strongly disagree,” which receives a score of 5. If a
respondent underlines “strongly agree” his score on the
statement would be 1. The other possibilities, namely, “disagree,
doubtful, agree,” would be scored 4, 3 and 2, respectively.
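The weighting scheme just described can be sketched in a few lines. The function names are hypothetical, and the sketch assumes that every statement is keyed to one of the two extreme responses, as in the example given for statement 1; the article does not list the key for the other statements.

```python
# Response levels in the order printed on the questionnaire.
LEVELS = ["strongly disagree", "disagree", "doubtful", "agree", "strongly agree"]

def score_item(response, keyed_response):
    """Return the 5-to-1 weight for one underlined response.

    Assumes the keyed (correct) response is one of the two extremes.
    """
    position = LEVELS.index(response.lower())
    if keyed_response == "strongly disagree":
        return 5 - position
    if keyed_response == "strongly agree":
        return position + 1
    raise ValueError("keyed response must be an extreme level")

def total_score(responses, key):
    """Sum the item weights over all statements."""
    return sum(score_item(r, k) for r, k in zip(responses, key))

# Statement 1 is keyed "strongly disagree" according to the text.
statement_1 = {level: score_item(level, "strongly disagree") for level in LEVELS}
print(statement_1)
```

With thirty statements, totals under this scheme would run from 30 to 150.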

2 Reproduced in Appendix A.

The reliability of the scale was established by the
correlation between the odd and even statements composing the scale.
When corrected by the Spearman-Brown formula, this
correlation was .870, which is in keeping with attitude scales of this
type. In order to establish the reliability of the individual
statements the scores made by the highest and lowest quartiles
were compared. Using 2.5 as the criterion for judging the
significance of the difference between these two quartiles, all
but five items, namely, statements no. 8, 11, 20, 21 and 26,
differentiated the extreme quartile groups.
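The odd-even procedure can be sketched as follows. The small response matrix is invented for illustration, and the function names are hypothetical. The half-test correlation of .77 used in the last line is back-solved from the reported .870 and is not a figure given in the article.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown_double(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(item_scores):
    """item_scores: one row of item weights per respondent."""
    odd = [sum(row[0::2]) for row in item_scores]   # statements 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # statements 2, 4, 6, ...
    return spearman_brown_double(pearson_r(odd, even))

# Invented data: four respondents, four statements each.
rows = [[5, 4, 5, 4], [3, 3, 2, 2], [1, 2, 1, 1], [4, 4, 5, 5]]
print(round(split_half_reliability(rows), 3))

# A half-test correlation near .77 would step up to the reported .870.
print(round(spearman_brown_double(0.77), 3))
```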

The test was first administered to a group of fifty-nine
foster mothers who were in attendance at one of the lectures
given in May, 1937. This group was then matched with a
group of foster mothers who had not attended any of the
lectures. Matching was done on three indices, namely,
educational attainment, occupation of husband, and sociality, as
determined by the Leahy Measurement of Urban Home
Environment.⁴ The matching yielded fifty pairs and was
calculated to within plus or minus 1 sigma for each of the
aforementioned indices.


The similarity of the two groups of mothers relative to these
indices is best shown in Table 1. Under the classification of
“Experimental” are listed the scores of the mothers who had
attended the lectures. The scores of the mothers who had not
attended any lectures are reported under the caption “Control
Group.”


TABLE 1

Comparison of Means and Standard Deviations of Experimental and
Control Groups of Mothers for Sigma Scores on Education,
Occupation and Sociality

                 Experimental group          Control group
Variable         N      M       S.D.         N      M       S.D.

Education        50   −.232     .842         50   −.180     .786
Occupation       50             .537         50    .042     .636
Sociality        50    .612    1.023         50


3 Appendix 1.
4 Leahy, Alice M.: The Measurement of Urban Home Environment.
Minneapolis: University of Minnesota Press, 1936.

Since zero is the mean sigma score for the generality of a
metropolitan population, according to the Leahy Scale, we see
that our foster-parents tend to be slightly below the average
in education and occupation, but somewhat above the average
in sociality. The formal schooling of the foster-parents is
about eighth grade; their mean occupational level would be
classified in the skilled trades group. Their positive deviation
from the average in sociality may reflect a capacity for group
adjustment and a basis of selection as foster-parents.

The analysis of our foster-parents’ attitudes toward child
training situations is presented in Table 2. An inspection of
this table shows that the experimental group attained a
slightly, but not significantly, higher score than did our
control group.


TABLE 2

Comparison of the Scores Obtained on the Questionnaire by
Experimental and Control Groups

Group            N      M       S.D.    M1 − M2   S.E. (diff.)   C.R.*

Experimental     50   113.0    14.95      1.1        2.48         .44
Control          50   111.9    12.80


* C.R. was obtained by the following formula:

$$\mathrm{C.R.} = \frac{M_1 - M_2}{\sqrt{\mathrm{S.E.}_{M_1}^{2} + \mathrm{S.E.}_{M_2}^{2} - 2\,r_{xy}\,\mathrm{S.E.}_{M_1}\,\mathrm{S.E.}_{M_2}}}$$
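As a numerical check on Table 2, the critical ratio for matched groups can be sketched as below. Two points are assumptions: each standard error of a mean is taken as S.D./√N, and the correlation between the matched groups, which the article does not report, is set to .21 because that value (back-solved here) reproduces the tabled standard error of the difference.

```python
import math

def critical_ratio(m1, sd1, n1, m2, sd2, n2, r_xy):
    """Critical ratio for the difference between two correlated means.

    Returns (C.R., standard error of the difference).
    """
    se1 = sd1 / math.sqrt(n1)  # assumed form of the S.E. of a mean
    se2 = sd2 / math.sqrt(n2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r_xy * se1 * se2)
    return (m1 - m2) / se_diff, se_diff

# Means, S.D.s and Ns from Table 2; r_xy = .21 is a back-solved assumption.
cr, se_diff = critical_ratio(113.0, 14.95, 50, 111.9, 12.80, 50, r_xy=0.21)
print(round(se_diff, 2), round(cr, 2))
```

A ratio of this size falls far short of the 2.5 criterion used earlier in the article for item differences, which agrees with the authors' reading that the groups do not differ reliably.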





Although the foregoing result does not disprove the general
merit of special lectures on child care and training, it raises
serious question as to the educational value of programs whose
lectures cover a diversity of topics. While it is possible to
ascribe the small difference in the scores of these two groups
of mothers to the factor of administration of the questionnaire,
on the assumption that individual administration is
productive of better results than group administration, it appears
highly improbable that these results can be entirely assigned
to this factor. One should have expected the experimental
group to surpass the control group. The question of diversity
of program may hence be seriously raised. If the
performance of the control group can be assigned in part to individual
instruction of mothers by the social workers who personally
supervise the child caring services of all the mothers working
for child placing agencies, then it would appear that the
educational program as pursued in Minneapolis is a
duplication of service.


MINNESOTA SURVEY OF FOSTER PARENTS’
OPINIONS OF CHILD CONDUCT


Name
Address
Social Agency


In the last three years have you attended any of the lectures for boarding
mothers? Do you usually attend all of them? Less than three a year?

Has your social worker mentioned these meetings? Has she
urged your attendance?


DIRECTIONS: Following you will find statements about children, their
training, and specific situations. With some of these statements you will
agree, with some you will disagree, and about others you will be doubtful.
If you are very sure of what you think, you will agree strongly or
disagree strongly. This is an illustration:

If a child refuses to go to school, he should be allowed to stay home.

Strongly disagree Disagree Doubtful Agree Strongly agree

Underline the phrase which best describes your attitude.


1. The purpose of discipline is to bring about strict obedience to the
person who is administering it.

Strongly disagree Disagree Doubtful Agree Strongly agree

2. To praise a child is generally a more effective means than to scold
him.

Strongly disagree Disagree Doubtful Agree Strongly agree

3. To give the child a feeling of security is one of the most important
functions of the foster home.

Strongly disagree Disagree Doubtful Agree Strongly agree

4. If a boy of ten wets his bed often, he should be severely punished.

Strongly disagree Disagree Doubtful Agree Strongly agree

5. Children under ten years of age should never be given an allowance
because they have no conception of the value of money.

Strongly disagree Disagree Doubtful Agree Strongly agree

6. A child should be taught to be dependent on his parents in all
matters.

Strongly disagree Disagree Doubtful Agree Strongly agree

7. If a small boy uses swear words frequently, that is a sign that he
was born bad.

Strongly disagree Disagree Doubtful Agree Strongly agree

8. If a child does not like going to bed, he can be encouraged to find it
enjoyable by telling him stories or singing with him.

Strongly disagree Disagree Doubtful Agree Strongly agree

9. When a child of ten sucks his thumb, you should try to make him
feel ashamed of himself.

Strongly disagree Disagree Doubtful Agree Strongly agree

10. A child’s manners and eating habits should not be a subject of
conversation at table when the other members of the family are present.

Strongly disagree Disagree Doubtful Agree Strongly agree

11. If a girl of ten insists on sleeping in the same room as her parents,
it is a matter of extremely great importance.

Strongly disagree Disagree Doubtful Agree Strongly agree

12. If a child often disturbs others with boisterous, disorderly conduct,
he should be made to stay in a closet for several hours in order to
teach him to be quiet.

Strongly disagree Disagree Doubtful Agree Strongly agree

13. Children should be told a few minutes before bedtime so as to give
them sufficient time to put away their playthings.

Strongly disagree Disagree Doubtful Agree Strongly agree

14. Cod liver oil should only be given to young children when they are
sickly.

Strongly disagree Disagree Doubtful Agree Strongly agree

15. The Schick test determines a child’s susceptibility to smallpox.

Strongly disagree Disagree Doubtful Agree Strongly agree

16. If one child in a family is less quick to learn than another, it will
always spur the slow child on if you constantly point out the
superiority of the other.

Strongly disagree Disagree Doubtful Agree Strongly agree

17. If a three year old boy tells wild stories which are obviously untrue,
you should punish him for lying.

Strongly disagree Disagree Doubtful Agree Strongly agree

18. If a child stutters, making fun of him is the best way to rid him of
the habit.

Strongly disagree Disagree Doubtful Agree Strongly agree

19. If a child is afraid of dogs, you should tell him there is nothing to
be afraid of and then drop the subject.

Strongly disagree Disagree Doubtful Agree Strongly agree

20. The growing child should be gradually educated to the use of
dangerous things such as matches or scissors.

Strongly disagree Disagree Doubtful Agree Strongly agree

21. The child needs more food in proportion to his size than does an
adult.

Strongly disagree Disagree Doubtful Agree Strongly agree

22. If you find a child masturbating, you should punish him severely so
that he will know how wrong it is.

Strongly disagree Disagree Doubtful Agree Strongly agree

23. If a child has had a fear of water since babyhood, it is probable
that he was born with it.

Strongly disagree Disagree Doubtful Agree Strongly agree

24. Children should be trained to sleep while the ordinary noises of the
household are going on, rather than to keep the home absolutely
quiet.

Strongly disagree Disagree Doubtful Agree Strongly agree

25. Since children are not born with a knowledge of the difference
between a truth and a lie, we must train them to realize what truth is.

Strongly disagree Disagree Doubtful Agree Strongly agree

26. The best way to remove fears in a child is to appeal to his
will-power.

Strongly disagree Disagree Doubtful Agree Strongly agree

27. If a child is suspicious and inclined to distrust people, he should be
encouraged because nobody will ever be able to make a fool out of
him.

Strongly disagree Disagree Doubtful Agree Strongly agree

28. The foster home should give the child freedom to grow according to
his own capacities.

Strongly disagree Disagree Doubtful Agree Strongly agree

29. The relationship of the foster mother to her charges should be
entirely different from that of the natural mother to her children.

Strongly disagree Disagree Doubtful Agree Strongly agree

30. Since the aim of placement is the best welfare of the child, the
foster mother can regard herself as working with the agency toward
this goal.

Strongly disagree Disagree Doubtful Agree Strongly agree





NOTES 


NOTE ON THE VALIDITY OF THE AMERICAN 
COUNCIL ON EDUCATION PSYCHO- 
LOGICAL EXAMINATION 


EDWARD E. CURETON 


Cooperative Test Service* 


THE Psychological Examination of the American Council on Education
(commonly called the ACE Test), though it is very widely
used as a college entrance test, has often been criticized. Its
critics usually claim that it is too much of a speed test, and that it is
too heavily weighted with numerical and spatial materials. The writer
has often heard the statement made that the opposites test in particular
(commonly regarded as the best test in the battery on no specific
grounds), would have a higher validity than the total battery if it were
lengthened sufficiently and applied without time limit.

Through the courtesy of Professor L. L. Thurstone and the American 
Council on Education, the writer was permitted to reproduce the ACE 
Opposites Test for six consecutive years (1927 through 1932). It was 
mimeographed on two sheets, three of the original 27-item tests on each 
sheet. The first sheet was given to five classes in general and educational 
psychology and tests and measurements on a Friday. The second sheet 
was given to some groups Monday, some Wednesday, and some Friday of 
the following week. The students were permitted to use the whole 50 
minute period to complete each set of 81 items if necessary. Every stu- 
dent finished in each case before the end of the period. There were 190 
students who completed both sheets. Of these, 137 had taken one of the 
later editions of the ACE Test when they entered college, and had also 
remained in school at least one and one-half additional years unless they 
were graduated before that time. 

All ACE Test scores were transmuted, by means of the percentile 
norms, into the corresponding raw scores on the 1928 edition of the test. 
The scores on the two sheets of opposites tests were added to give a total 
score on the 162-item test. An unweighted average of the four or more 
weighted semester-average grades was taken as the index of scholarship. 
The three intercorrelations, based on these 137 cases, were:

ACE-opposites ................ (illegible in this copy)
ACE-scholarship .............. .504
opposites-scholarship ........ .433

The difference between the second and third correlations is .071; and the
standard error of this difference is .056. The difference is not
statistically significant, but it is at least suggestive. At any rate it lends no
support to the claims noted above.

* Professor of Education, Alabama Polytechnic Institute, on leave.

The failure of the opposites test to exhibit a higher validity than the 
total ACE Test is not due to its having been insufficiently lengthened. 
If this were the case its reliability would also be low. The intercorrela- 
tions among all six of the original 27-item tests (three per sheet) were 
computed using 187 of the original 190 cases. Three papers were acci- 
dentally mislaid. Six of these correlations were between pairs of tests 
from the same sheet. The average of these six, raised by the
Spearman-Brown formula to give an estimate of the reliability of the total test,
was .942. This coefficient is roughly comparable to odd-even reliability
coefficients. It compares favorably with coefficients of this type for the 
total ACE Test as commonly reported. Nine of the correlations were 
between one test from one sheet and one from the other. The average of 
these nine, raised again by the Spearman-Brown formula to give an esti- 
mate of reliability for a test six times as long as one of the 27-item tests, 
was .857. This coefficient is comparable to reliability coefficients based 
on two forms of a test given a few days apart. 
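The two reliability estimates above rest on the general Spearman-Brown formula for a test lengthened k times. The sketch below, with hypothetical function names, applies it with k = 6. The average single-test correlations of .73 and .50 are back-solved from the reported .942 and .857; the note itself does not print them.

```python
def spearman_brown(r, k):
    """Reliability of a test k times as long as one whose parts
    intercorrelate r on the average."""
    return k * r / (1 + (k - 1) * r)

def implied_single_test_r(r_kk, k):
    """Invert the formula: the average inter-test correlation that
    would yield reliability r_kk at length k."""
    return r_kk / (k - (k - 1) * r_kk)

# Back-solved averages (assumptions, not reported figures).
same_sheet = spearman_brown(0.73, 6)     # tests given at one sitting
across_sheets = spearman_brown(0.50, 6)  # tests given a few days apart
print(round(same_sheet, 3), round(across_sheets, 3))
```

On these back-solved figures, the drop from .942 to .857 corresponds to a fall in the average inter-test correlation from about .73 within a sitting to about .50 across sittings.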





A COMMENT ON ROBERT L. THORNDIKE’S
“NOTE ON ‘I.Q. CHANGES IN FOSTER
HOME CHILDREN’ BY EMMETT
L. SCHOTT”


FLORENCE L. GOODENOUGH 
Institute of Child Welfare, University of Minnesota 


IN the December, 1938, issue of this Journal, Dr. Robert L. Thorndike
calls attention to a statistical error in a previous article by Schott.*
In a study of the mental test standing of 74 children before and
after placement in foster homes, Schott found a median increase of 5.8
I.Q. points or a mean increase of approximately 4 points. Schott states
that this difference is only 0.28 times its standard error and is therefore
statistically insignificant. Thorndike re-calculated this value from the
figures given by Schott and obtained a critical ratio of 4.6. He therefore
states without further comment, “The conclusion of Schott’s article
should read, ‘After an average length of residence in a foster home of
about one year, our group show a statistically reliable gain in I.Q.’ ”

* Emmett L. Schott, “I.Q. Changes in Foster Home Children.”
Journal of Applied Psychology, 1937, 21, 107-112.

It is greatly to be regretted that errors in computation should so
frequently appear in research publications, with consequent danger of gross
distortion of the obtained results. It is even more regrettable that in
quoting portions of the work of other investigators, facts are often lifted
out of their context with even more misleading results than are likely to
occur from the computational errors previously mentioned. Of this, the
present instance is a striking example. There is nothing in Thorndike’s
note that indicates the existence of qualifying factors which would in
any way modify the direct implication that this “statistically
significant” gain might be attributed to conditions other than the period of
residence in the foster home. Yet Schott has been at considerable pains
to point out that the initial tests were given at a time of particular
emotional stress for the subjects. They were commonly administered when
the child was first turned over to the agency immediately after the
breaking up of the home through the death of a parent or for other reasons.
At the time of testing the child had been newly placed in a temporary
boarding home or hospital pending decision as to final disposition.
Although, as Schott is careful to point out, there is no objective way of
knowing the extent to which these adverse emotional conditions may have
operated to lower the standing on the first test, no one with even an
elementary understanding of child behavior would be inclined to ignore such
a possibility. Schott’s opinion that the second test, given approximately
a year after placement, was a more accurate measure of true ability than
the first would almost certainly be shared by the majority of competent
individuals. The fact that a third test given to 19 of the subjects
yielded a mean I.Q. identical with that obtained on their second test
lends further support to this hypothesis. It should also be noted that
some effect of previous practice, even after a year’s interval, has been
commonly found on a second administration of an intelligence test to
young children for whom no radical change in environment has taken
place.

Both statistical accuracy and psychological insight are matters of
prime importance in research. “Statistical significance” may be quite
unenlightening in the absence of data that enable the reader to judge
its psychological significance. It may even be highly misleading if, as
in the present instance, facts necessary for a fair interpretation of the
findings are ignored or suppressed. Regrettable as the computational
error in Schott’s article unquestionably is, I am of the opinion that
Thorndike’s proposed correction of Schott’s conclusion, in which but one
of the two known variants is mentioned, may lead to a far greater error
in interpretation on the part of those who read only his brief note. I
therefore propose that this correction be subjected to further amendment
as follows: “After an average length of residence in a foster home of
about one year, the children of our group showed a statistically reliable
gain in I.Q. as compared to their standing on an initial test given at a
time of considerable emotional stress immediately after the breaking up
of their own homes. The data do not enable us to determine whether the
apparent gain resulted from the more stimulating environment of the
foster homes or from a temporary depression of the initial ratings as a
result of the unfavorable conditions incident to the administration of the
first test.”





NEWS AND NOTES 


The following items are taken from The Psychology of Marketing 
Bulletin No. 12, issued by the Psychological Corporation, New York City: 

‘*Probably the greatest national figure among psychologists, in fact, 
one of the greatest figures in any field today, is Dr. George Gallup. The 
story of his life and his development of the Institute of Public Opinion 
appeared in The Saturday Evening Post of January 18. Every psycholo- 
gist should read it. Dr. Gallup did his Doctor’s thesis under Dr. Seashore 
and Dr. F. B. Knight, at the University of Iowa. We understand there 
was some question in regard to his thesis because it was based on a 
sampling of consumers in respect to some article like shaving soaps. 
There was a question as to whether this was really psychology. Today 
Dr. Gallup stands in the forefront of those who measure the minds of 
people through scientific sampling by means of personal interviews. His 
Institute of Public Opinion has been called the outstanding contribution 
to social psychology in this century. Members of the Psychological Cor- 
poration are especially proud because his work has helped to vindicate 
the studies of public opinion, buying habits, etc., which they have been 
developing since 1931. . . . Incidentally, Dr. Gallup is the head of con- 
sumer research for the Young & Rubicam advertising agency. He con- 
ducts his Institute of Public Opinion during his spare time. 

‘‘Dr. Frederic B. Knight, under whose immediate direction Dr. Gallup 
wrote his Ph.D. thesis, is head of the division of education and applied 
psychology at Purdue University. He tells us that they give Ph.D. degrees 
in psychology entirely in the field of applied. Dr. Knight, with Dr. Joseph 
Tiffin, head of the psychological group, was visiting in New York in con- 
nection with their device for photographing the movements of the eye in 
looking at advertisements. Several organizations are interested in the 
possibilities of this device in testing advertising. 

‘‘The newspapers recently announced another landmark in the growth 
of applied psychology. Dr. Donald A. Laird, for many years head of the 
Department of Psychology at Colgate University, was appointed head of 
a new foundation for consumer analysis by one of the largest advertising 
agencies in the country, N. W. Ayer & Sons. ... Dr. Laird says: ‘I am 
trying my level best to get something started here which will not only be 
a good credit to psychology, but which may also in the course of time open 
up further opportunities along similar lines for other men in the field. 
That is one of my reasons for making haste slowly.’ 

‘‘Dr. Otto L. Tinklepaugh was appointed head of market and consumer 
research by J. M. Mathes, one of the leading advertising agencies in the 
country. This appointment was made some months ago and continues the 
pleasant relationship which the Psychological Corporation has enjoyed with 
Dr. Tinklepaugh for several years.’’ 


Occupational Index, prepared and distributed by the National Occupa- 
tional Conference, New York, announces as its new Advisory Board the 
following: John W. Studebaker, U. S. Commissioner of Education; Alex- 
ander J. Stoddard, Superintendent of Schools, Denver, Colorado; and Carl 
Milam, Secretary of the American Library Association. The Index is a 
continuous bibliography of books, pamphlets, and periodical references 
containing information helpful to young persons in choosing an occupa- 
tion. Persons interested in such material may obtain a free sample copy 
on request. 


The Department of Psychology of the University of Chicago will offer 
in the summer quarter program a course on Physiological Psychology by 
Chester W. Darrow, of the Institute for Juvenile Research. Dr. Louis L. 
Thurstone will present a course on Factor Analysis dealing with methods 
for isolating mental abilities. Test Theories and Theory of Statistics are 
courses to be offered by M. W. Richardson and Harold O. Gulliksen re- 
spectively. Dael L. Wolfe will deal with the sensory processes in a course 
on Experimental Psychology; Forrest A. Kingsbury will offer two courses: 
one on Tests and the other on the History of Psychology. 


The World Book Company has recently issued Test Service Bulletin No. 
39, containing a digest of an address by Dr. John L. Stenquist, Director of 
Research in the Baltimore Public Schools, which was first published in the 
Baltimore Bulletin of Education, March-April, 1938. Dr. Stenquist re- 
views the beginnings and development of mental testing in the Baltimore 
Schools. Two and a half million tests have been given in the schools in 
this city since mental testing was introduced as a part of the regular pro- 
gram in 1922. ‘‘If I were asked what single item of greatest importance 
has resulted from our work with mental measurements,’’ states Dr. Sten- 
quist, ‘‘I would answer, ‘The discovery of individual differences.’ ’’ 


A pleasant way of taking college extension courses is now available on 
the S.S. Rotterdam, Rio-bound on its 53-day cruise in connection with the 
Eighth Biennial Congress of the World Federation of Education Associa- 
tions from August 6 to 11. Three such courses will be offered as follows: 
Geography of South America and Geography of Caribbean America by 
Clark University, and Comparative Education by Dean 
Henry Lester Smith of Indiana University. Inquiries regarding the 
credits available for these courses and other information concerning the 
cruise should be addressed to the World Federation of Education Asso- 
ciations headquarters, 1201 16th Street, N.W., Washington, D. C. 


The Fifth Conference on Education and the Exceptional Child will be 
held under the auspices of the Child Research Clinic of The Woods Schools 
on April 25, at Langhorne, Pa. The morning session will include addresses 
by Dr. Charlotte Easby Grave, Consulting Psychologist, The Woods 
Schools, on ‘‘Twenty-five Years of Progress in Education at The Woods 
Schools’’; Dr. Frank Astor, Liaison Office, National Child Welfare Asso- 
ciation, New York, ‘‘Application of Mental Hygiene Principles to the 
Classroom.’’ Dr. Garry Cleveland Myers, Western Reserve University, 
Cleveland, will speak on ‘‘Emotions as Allies or Enemies in Learning.’’ 
The afternoon session will be devoted to two addresses, the first by Dr. 
C. E. Benson, Director of the Psycho-Educational Clinic, New York Uni- 
versity, on ‘‘Mental Hygiene and the Exceptional Child,’’ the second by 
Dr. Sidonie Matsner Gruenberg, Director, Child Study Association of 
America, New York, on ‘‘The Development of Child Study.’’ 


The Third Conference of Teachers of Industrial Relations will be held 
on May 4, 5 and 6, 1939, at the University of Michigan, Ann Arbor, 
Michigan. Two sessions of the Conference will be devoted to discussions 
of the objectives and content of courses pertaining to Industrial Relations. 
Further details concerning the Conference may be obtained from John W. 
Riegel, Director, University of Michigan. 


The preliminary program of the 63rd Annual Convention of the Amer- 
ican Association on Mental Deficiency has been received. This Convention 
will be held at the Palmer House, Chicago, Illinois, May 3-6th. Further 
information may be obtained from Dr. Neil A. Dayton, President of the 
Association, State Department of Mental Diseases, Room 701, 100 Nashua 
Street, Boston, Mass. 


A short summer course on vocational guidance sponsored by the National 
Institute of Industrial Psychology in Great Britain will be held in London 
from August 9 to 19 inclusive. It is intended primarily for those who 
act or intend to act as careers masters and careers mistresses in public 
and other secondary schools in England or Scotland but the course will 
be of interest to counselors and vocational guidance workers in the United 
States. It will consist of lectures, discussions and practical work. The 
fee will be five guineas. This is the fifth annual summer vacation course. 

During the same period, August 9 to 19, the Institute will also sponsor 
a course on the administration of the new English version of the Terman- 
Merrill revision of the Binet-Simon tests of intelligence. The fee for the 
course will be three guineas. These courses will be in charge of Alec 
Rodger, Director of Vocational Guidance for the N. I. I. P. 



BOOK REVIEWS 

J. F. DULLES. War, Peace, and Change. New York: Harper & Brothers, 
1939. 

This recent book by an expert in the field of international relations 
deserves careful consideration from psychologists, because the author has 
presented a tentative approach to the problem of war and peace which 
is partially psychological in nature and which, for once, contains few 
ideas to which a scientifically trained psychologist could offer serious 
objection. 

The author was a representative of the United States Government at 
the Versailles negotiations in 1920, serving on the reparations commission. 
He has also had abundant other experience which has acquainted him 
with the problems of international conflicts. 

The book is begun with a chapter dealing with conflicts between indi- 
viduals, and points out that there are necessarily going to be conflicts 
between those persons who have material possessions sufficient to satisfy 
them, and individuals who are not satisfied with their situation. He calls 
this the conflict between static and dynamic influences in the population. 
In other words, in any society we shall find a certain number of individuals, 
or a group, which seeks to maintain the status quo because they are, by and 
large, satisfied with existing conditions. Another group will be composed 
of those persons who are dynamic; that is, who are striving to change con- 
ditions in order to obtain a greater satisfaction for themselves. 

Under primitive conditions, the conflict between these persons was un- 
doubtedly resolved on the basis of force. Human progress, however, has 
been made possible by the elimination of the use of force in resolving these 
personal conflicts. The methods used in eliminating violence can be classi- 
fied as (1) the ethical solution, and (2) the political solution. The ethical 
solution provides that we educate people to give up their own good for 
others, to distribute their wealth (taxes, etc.), to make sacrifices, and so on. 
The political solution requires arbitration, the use of the courts, orderly 
procedure, etc. 

The ethical solution is probably impractical under conditions of extreme 
deprivation; the political solution impossible, if any very large element 
of the people refuse to go along. 

When we attempt to transfer these two types of conflict solutions to 
the problems of international relations, we find that the ethical solution 
is not applicable at present, because we have developed a point of view 
in which nations and other corporate bodies are soulless entities having 
no obligation to look out for the welfare of other people. This doctrine 
of the soulless character of corporations is, of course, one which is generally 
recognized at law. Many legal decisions specifically provide that a cor- 
poration has no right to consider the welfare of people other than its own 
stockholders. In the same way, the national government is deemed to have 
the duty to promote only the welfare of its own constituents. 

The political solution to the problem of international violence is not 
applicable because: 

1. Treaties have been conceived as fixed and permanent. This does not 
allow for the development of dynamic groups or dynamic nations. 

2. There is, at the present time, no authority which is recognized as 
being above that of the nation (in other words, the problem of sov- 
ereignty). 

3. National governments have found it convenient to maintain internal 
solidarity and keep a particular group of politicians in power by 
calling attention to the danger of external enemies. 

4. As world society has become more complicated and as economic rela- 
tionships between people living in different nations have become more 
difficult, the balance between static and dynamic impulses is harder 
to evaluate and to maintain. 

The author then turns his attention to the tendencies which he calls those 
of totalitarian war. Totalitarian war, he points out, is something modern. 
Wars in previous centuries were fought mostly by professional armies and 
might make relatively little difference to the average citizen. Modern 
wars, on the other hand, require the cooperation of the entire population; 
they require a spirit of mass sacrifice, and are directly contradictory to the 
thesis that people are always individually selfish. 

These wars, Dulles points out, are made possible by the emotional 
character of human beings, through newly developed techniques for arous- 
ing emotional excitement in the general population. These mass emotions 
are related to the conception of the nation-hero and the other-nation- 
villain. 


The author then turns to positive recommendations and gives three which 
are definitely psychological in their nature. Of these, the first and prob- 
ably the most immediately practicable is that we should avoid exaggerating 
the villainous qualities of other nations. We have built up, in the minds 
of our people, an attitude of suspicion and distrust toward foreigners 
which is, of course, characteristically repeated in other modern nations. 
Thus, we interpret foreign actions as implying warlike tendencies toward 
us, and consequently we feel that we must defend ourselves. 

Secondly, he urges that we dilute the concept of our own nation as a 
deity or hero. We must remember that all nations are pretty much alike, 
and that ours does not have unique divine qualities nor unanimously high 
ideals and impulses. The state must be regarded as merely a human 
device set up for facilitating certain kinds of human transactions. 

Third, we must dilute the concept of the state as benefactor. We have a 
tendency to think of the state as being the only agent through which 
certain kinds of economic benefits can be achieved. Dulles argues that 
this attitude is likely to lead to an exaggeration of the importance of 
national honor and national power, which in turn contributes to interna- 
tional conflict. 

The fourth suggestion is of a more definitely political nature and relates 
to the facilitation of moderate change where this may possibly reduce the 
use of violence. For example, he points out that the refusal, twenty or 
thirty years ago, to allow the Japanese equal rights with European peoples, 
in the form of Japanese exclusion laws and similar measures, has con- 
tributed in no small degree to the present warlike temper among the 
leaders of the Japanese people. If small changes in the nature of privi- 
leges and prestige were granted to nations beginning to become dynamic, 
the cumulative quality of this tendency might be forestalled, and the 
violent results of interference avoided. 

Finally then, the author emphasizes that peace must not be identified 
with rigidity and the maintenance of the status quo. Peace, on the con- 
trary, is a constant state of change, but this change must be of such a 
nature as to be a happy compromise between the nations which hold large 
portions of the world’s resources and wealth, and those nations seeking to 
obtain a share of it for their own citizens. 

The book is very well written, clear, and throughout characterized by a 
simplicity of statement which is admirable. From a psychologist’s point 
of view it is better than a great deal which has been published in the field 
of international relations, although it does not come up to the precision of 
statement which we might like to see. 

The author does not attempt to treat of many problems which naturally 
occur to us, and he is very well justified in refusing to do so. Certainly, 
he could be accused of going outside his legitimate province if he were to 
attempt to discuss the kind of motives which operate in the controlling 
groups of the respectively dynamic or static nations, and similarly he 
would be outside his field of special training if he were to attempt to pass 
judgment on the reasons for the sacrificial tendency characterized by popu- 
lations under modern war-time conditions. 

All in all, I think it is definitely a contribution to the voluminous 
literature on this subject, which deserves careful reading by socially 
minded psychologists. His point with regard to the significance of the 
nation-hero and nation-villain concepts is particularly worth our considera- 
tion. It is just this matter of differences in national point of view which 
is most open to psychological study, and, perhaps in the long run, to 
psychological modification. 

ROSS STAGNER, 
University of Akron. 


ELIZABETH I. ADAMSON. So You’re Going to a Psychiatrist. New York: 
Thomas Y. Crowell Company, 1936. Pp. 263. 

The purpose of this contribution to popularized science as summed up 
by the author, Dr. Elizabeth Adamson, is as follows: ‘‘This book attempts 
the answer to a question I have often been asked, and have never been 
able to meet to my own satisfaction. . . . What books can we read, I was 
frequently asked, that will present, in simple language, an accurate account 
of modern psychiatric thought? ...I have tried, in non-technical lan- 
guage, to give what is now known, within the scope of psychiatry, about 
the reasons for success and failure in human adjustment.’’ 

The reader is initiated into psychiatric realms by means of such capti- 
vating chapter headings as, ‘‘ From Intuition to Intelligence,’’ ‘‘ Politics 
of the Mind,’’ ‘‘Running from Ghosts,’’ ‘‘The Baby’s Five-Year Pro- 
gram,’’ ‘‘Design for Immaturity,’’ and so on. The author conceives of 
psychiatry as ‘‘concerned with what doctors call personality deviations, 
whether mild or severe.’’ 

Let us not attempt to condemn by faint praise. The best that can be 
said for this book is that it is ‘‘easy reading’’ for the intelligent layman. 
In the opinion of this psychologist, Dr. Adamson has failed in her attempt 
to give an accurate account of modern psychiatric thought. If her repre- 
sentation of psychiatry is accurate, then much of psychiatry must be 
labeled as pure nonsense. It is said, for example, that the criminal char- 
acter is an expression of repressed childish conflicts and that he is the 
victim of retained infantile mechanisms. Further on we read that ‘‘Prob- 
ably the most effective general approach to the problem of crime is to 
treat it as a contagious disorder acquired by children chiefly from parental 
‘carriers.’ ’’ Such statements, unsupported as they are by scientific data, 
suggest naiveté on the part of the author in accepting the doctrines of 
psychoanalysis without the critical point of view. 

The author has succeeded in accenting the psychoanalytic point of view 
rather than the psychiatric. Witness the interpretation of manic-depres- 
sive psychosis on page 48 where it is pointed out that those suffering from 
manic-depressive psychoses have personalities which yield first to the 
Super-Ego for a few weeks, months, or years, and then to their Id. Even 
the layman is forced to smile at the in absentia diagnoses which are made, 
especially that of the fictional character Don Juan who, it is said, is 
regarded by psychiatry as a latent homosexual. 

The reader gains the impression from the reading of this book that 
psychiatry is not a recent development, highly speculative, but rather a 
closed chapter of explanations, fully matured, not having required the 
binding ties of experimental methods, nor requiring them in the future. 
Statements are propounded as dogma; statistics are never cited. Terms 
are freely used without attempts at exact definition. Critical differences 
between concepts are not delineated. Descriptive terms are used as 
explanatory concepts. There is no reference whatsoever to the vast fund 
of experimental data on learning. 


ANTHONY J. MITRANO, 
Psychological Corporation. 


SARA M. STINCHFIELD AND EDNA HILL YOUNG. Children with Delayed or 
Defective Speech. Stanford University: Stanford University Press, 
1938. Pp. XVI plus 174. $3.00. 

This book presents the practical and theoretical aspects of delayed and 
defective speech development. The first part dealing with the theoretical 
side has been written by Dr. Stinchfield. It is an account of the psycho- 
logical study of children with speech defects compared to children with 
normal speech. Although there is no supporting evidence, an important 
concept is introduced, that of speech readiness. It is probable that there 
is a psychological moment for the development of normal speech. This 
period is toward the end of the first year in the average child. After the 
second year, the most favorable period for speech development may be 
over. 

When the causes of defective speech are not hereditary or accident or 
injury, the author believes there are two causes of psychological nature. 
In one there is failure in the child’s perceptual development, the function- 
ing of the sensory-receptive speech areas, and transmission of impulses to 
motor areas in the brain. In the second there is lack of formation of 
associations required for the comprehension of meanings and for the recep- 
tion, retention, and reproduction of speech memories. 

Physical examination of twenty-three children with speech defects re- 
vealed no particularly significant factor which might have contributed to 
the development of defective speech. From the results obtained with the 
Stanford-Binet (44 cases), Kuhlmann (22 cases), Goodenough Drawing 
(16 cases), and Seguin Form Board (15 cases), it is assumed that mental 
deficiency is not the primary factor in causing delayed or defective speech 
development. The children did considerably better on the performance 
tests than on the Binet or Kuhlmann test. The mental tests also indicate 
that speech is somewhat more closely related to performance tests than 
to those involving language. The data suggest the use of a moto- 
kinesthetic approach to speech training in these children rather than the 
usual abstract, auditory, and visual appeal. Selected cases revealed that 
following special speech training there is an improvement in IQ, e.g., from 
feebleminded to average. 

Speech and audiometer tests were also administered. The speech tests 
revealed that boys made more errors than girls. The audiometer tests 
demonstrated that the children with defective and delayed speech received 
lower scores than normal children. There was a direct relationship 
between loss in auditory acuity and defective speech. 

The second part dealing with practical therapy was written by Edna 
Hill Young. This section is of more immediate value to the teacher of 
children with defective speech. The principal contributions in the first 
part were that mental deficiency was not a primary causal factor in delayed 
or defective speech and that reduced auditory acuity is related to defective 
speech. But the implication of these contributions could only be appre- 
ciated in the practical work of teaching children to speak. 

In this section, the author presents not only the general method of a 
moto-kinesthetic approach but actually describes the exercises necessary 
for the formation of the correct sounds of all the speech elements. There 
are also photographs of the correct formation of mouth and lips in order 
to produce these elemental speech sounds. To the speech therapist these 
exercises are invaluable. 

SYDNEY ROSLOW, 
The Psychological Corporation. 


ALEXANDRA ADLER. Guiding Human Misfits. New York: The Macmillan 
Co., 1938. Pp. 88. $1.75. 

The subtitle, ‘‘a practical application of individual psychology,’’ sets 
the limits to ‘‘human misfits.’’ According to the author, who is the 
daughter of Alfred Adler, ‘‘In dealing with mentally abnormal cases, 
individual psychology particularly concerns itself with difficulties in ‘prob- 
lem children’ and with neuroses, including problems of suicide, drunken- 
ness, drug addiction, sexual perversion, and lastly delinquency.’’ This 
small book is a practical essay on the clinical usefulness of ‘‘ individual 
psychology’’ in this sense. 

The author points out the paramount importance of childhood experi- 
ences in the development of personality difficulties. Such personality diffi- 
culties are known only by symptoms which always are meaningful in terms 
of the individual’s goal. Clinical study must discover these goals. By 
an apt device of running comment on a series of cases the author illustrates 
her method. 

While Adlerian ‘‘ individual psychology’’ furnishes the theoretical basis 
for the practical discussion, it does not intrude itself. The author’s varied 
clinical experience is evident in her practical advice to those dealing with 
personality problems. Inasmuch as half of the chapters are concerned 
with children’s problems, this volume should be studied by every psycholo- 
gist working with children. 

C. M. LOUTTIT, 
Indiana University. 


ROBERT C. COLE. An Evaluation of the Vocational Guidance Program in 
the Worcester Boys’ Club. Worcester, Mass., 1939. 98 pp. 


In this careful and convincing study, Mr. Cole attempts to analyze the 
worth of the vocational guidance service of the Worcester Boys’ Club and 
to present a complete picture of statistical evidence to support his conclu- 
sions. The Boys’ Club is made up of some seven thousand boys who have 
limited advantages in life. The club maintains a complete program for 
them, including three individual service departments dealing with physical, 
vocational, and behavior problems. The department with which this study 
deals and of which Mr. Cole is director comprises a variety of educational 
and vocational classes, as well as a vocational guidance and placement 
service. 

In evaluating the worth of the department’s work over a period of time, 
Mr. Cole selected two hundred boys who were registered with the club in 
1931. Of this group, one hundred had had some vocational service from 
the club and had shown some interest in it, while the remaining one hun- 
dred had had no guidance and had signified no desire for such service. The 
boys were carefully matched according to age, IQ range, degree of school 
accomplishment, physical status, economic status, social status, interest, 
and ambition, so that the two groups were comparable in every respect. 

The status of these same boys in 1936 was carefully studied, including 
a detailed analysis of school status, vocational accomplishments, behavior, 
delinquency, and all factors pertinent to the study. Using various criteria, 
a detailed comparison is drawn between the status of the groups at the 
two periods observed. In school attainment, economic stability, earning 
capacity, type of occupations, social conduct and behavior, the group hav- 
ing had some vocational guidance showed quite an advantage over the 
group having had no service of this type. 

In his conclusions Mr. Cole critically surveys the procedure used, and 
anticipates negative reactions to the study. He defends such procedures 
as are defensible, and explains the necessity of the use of some few others. 
In his conclusions he avoids generalizations almost to the point of excessive 
repetition, but nevertheless builds therein a strong case for guidance. 
After summarizing the things which guidance did for the group who were 
helped in the Education and Guidance Department, he concludes: ‘‘. . . if 
guidance can do these things—and apparently it can—then it more than 
pays for itself. It brings large returns to the individual and to society. 
It is more than worthwhile and desirable. It is a necessity.’’ 

DOROTHY C. REECE, 
Ohio University. 